Architecting a Modern Data Lake Using Object Storage
12 Oct 2022
Data Integration & Management Theatre
The modern data lake is a dynamic ecosystem of compute, storage, networking, table formats, applications and workloads. What architectural considerations are needed to deliver performance at scale for big data, AI and ML workloads? In this talk, MinIO will outline a data lake reference architecture that meets the throughput and scale demands of modern enterprises, with a focus on the following areas:
1. Separation of compute and storage
2. Disaggregation of monolithic frameworks into best-of-breed frameworks
3. Seamless performance across small and large files/objects
4. Software-defined, cloud-native solutions that scale horizontally