How Open Source Is Reshaping the Cloud Data Warehouse Landscape
In the last decade, the rise of the proprietary cloud data warehouse, led by platforms like Snowflake, BigQuery, and Redshift, has helped modernize data warehousing by bringing scalability, convenience, and, most importantly, flexibility and openness to a very important class of data workloads. Once this data was available in the cloud, it became possible to use it for more use cases, including user-facing analytics, dashboarding, observability, and machine learning. Stretching the warehouse across these additional workloads led to recurring performance challenges, a degraded user experience, runaway costs, and vendor lock-in. In this talk, we explore the role that open source technologies (e.g., open source real-time analytical databases like ClickHouse) and open data lake standards (e.g., Iceberg, Hudi, Delta Lake) play in transforming the modern data stack and helping organizations move away from a monolithic cloud data warehouse.