Below you will find pages that utilize the taxonomy term “Data Papers”
This is the next installment in my quest to read and help explain interesting papers in the data space.
A new entry in the data papers series. Ray is a distributed framework for next-generation AI applications. What does this mean? A scam? Blockchain on AI? Nah, it’s actually pretty cool: it has actors.
It has been a while since my previous data paper. This time I tackle a lesser-known one.
Lakehouse is the brand name for the underlying architecture of Databricks' Delta Lake: a data lake that is as performant as a data warehouse.
Hive is arguably old. It is also undoubtedly useful, even now, 10 years after it was introduced.
This is the next instalment in my quest to read and help explain interesting papers in the data space.
After reading the Snowflake paper, I got curious about how similar engines work. Also, as I mentioned in that article, I like knowing how the data sausage is made. So, here I will summarise the Delta Lake paper by Databricks.
I didn’t know much about Snowflake, so I decided to have a look at its SIGMOD (ACM Special Interest Group on Management of Data) paper and dig a bit deeper into what special capabilities it offers and how it compares to other engines.