Below you will find pages that use the taxonomy term “Data Papers”.
Lakehouse is the brand name for the underlying architecture of Databricks’ Delta Lake: a data lake that is as performant as a data warehouse.
Hive is arguably old. It is also undoubtedly useful, even now: 10 years after it was introduced.
This is the next instalment in my quest to read, and help others understand, interesting papers in the data space.
After reading the Snowflake paper, I got curious about how similar engines work. Also, as I mentioned in that article, I like knowing how the data sausage is made. So, here I will summarise the Delta Lake paper by Databricks.
I didn’t know much about Snowflake, so I decided to have a look at its SIGMOD (ACM Special Interest Group on Management of Data) paper and investigate a bit more what special capabilities it offers, and how it compares to others.