Below you will find pages that utilize the taxonomy term “Data”
I am a big fan of Concept maps, but writing them in Graphviz is annoying. So I wrote a helper.
I know, I have been silent for quite a while.
Had some stuff going on that ate all my available time.
Feeling less tired this weekend.
Playing with stable diffusion on your own machine is great.
Slowly, very slowly, cleaning up my reading list.
Slightly shorter because I’ll be on holidays.
This is the next installment on my quest to read and help understand interesting papers in the data space.
Spent a week in Switzerland… and on the flight back caught COVID.
A new entry on the data papers series. Ray is a distributed framework for next generation AI applications. What does this mean? A scam? Blockchain on AI? Nah, it’s actually pretty cool, it has actors.
Shit ain’t gettin' better.
This has been a really tough week.
I was at J on the Beach, so skipped last weekend.
A middle-of-the-week one because it’s Easter and I may not write this during days off.
It has been a while since my previous data paper. This time I tackle a lesser-known one.
This turned out to be a long one.
Trust between business stakeholders and engineering (and data, analytics, operations…) teams is a tricky matter.
Meetings galore.
Glory to Ukraine! Glory to the heroes!
The dbt issue
RIP Michael (originally Marvin) Lee Aday, Meat Loaf 🤘
Haven’t read much these days, but luckily I have not added much to the list either.
Another week gone by, with a long list of readings passing through.
Happy new year!
Christmas edition!
End of year cleanup, so a lot of goodies this time
In which I write some easy Alloy code for a data model, with change over time.
An unusual collection of links.
As usual, skipping an edition means a longer collection later on.
Two weeks’ worth of readings means a longer than usual list, as usual.
This past week we were on holidays 🎉.
Days of fire Kafka and thunder SSL.
This Apache Spark feature has made us scratch our heads way too much.
I had a very entertaining week.
A week on holidays (in-between jobs), where I read more books than articles.
Next week I start a new job 😮
This past week I’ve been on holidays in Cordoba. I put on 2.5kg in 4 days. Recommended.
This past week has been Data+AI Summit, so there are several new product announcements from Databricks.
In which I write some easy Alloy code for a data model.
I have spent a good deal of these weeks moving my notes from Bear to Obsidian. I may write the reasons at some point, stay 🐟.
Timezones and UTF are rocks you repeatedly hit in your data journey.
Looks like my mojo is coming back.
This edition is kind of strange: there’s more management than “code”.
Not sure what I did this past week aside from finishing a post: I read very little.
Lakehouse is the brand name for the underlying architecture of Databricks' Delta Lake: A data lake that is as performant as a data warehouse.
This is also a video-heavy edition: I keep chugging along my watch list with Glancer.
Hive is arguably old. It is also undoubtedly useful, even now, 10 years after it was introduced.
The video edition
As promised, the numbering of these posts is now year-indexed.
Managing logging in Spark ain’t easy, and is even harder in managed clouds like Databricks or EMR.
Finishing and posting this got lost in a task manager reorganisation; it was due in June-July.
My reading list is at less than 10 items, so now my readings posts will hopefully look closer to watchings. My watch-later list is at more than 90.
I have been on holidays this week, playing VR and preparing videos for an online Python event I co-organise.
This looks like a less technically heavy edition than usual.
A relatively common type of query for time-based SQL tables is a “find the gap” query. How can you do this in AWS Redshift, which does not have the SQL function generate_series?
This is a short edition.
Finished that post, now started the next. And having all Fridays off is awesome.
This is the next instalment on my quest to read and help understand interesting papers in the data space.
I am having a very hard time finishing my summary of the RDD paper.
I have played a ton of virtual table tennis this week.
I have read quite a bit this week, and I’m also preparing a summary of the RDD paper.
After reading the Snowflake paper, I got curious about how similar engines work. Also, as I mentioned in that article, I like knowing how the data sausage is made. So, here I will summarise the Delta Lake paper by Databricks.
I have dropped the Weekly from the title. It was about time.
The one where Airflow messes with you.
I didn’t know much about Snowflake, so I decided to have a look at its SIGMOD (ACM Special Interest Group on Management of Data) paper and investigate a bit more what special capabilities they offer, and how they compare to others.
This is a bit late because I have automated something.
Although this week I have been reading mostly Apache Cassandra documentation, I have tried to avoid an onslaught of tips, tricks and readings on it. Just one article.
I have been on quite the hiatus, making this more of a readings of the month edition. Sorry!
I have been pretty busy lately, and although reading doesn’t stop, my writing sometimes takes a hiatus.
Data engineering, adtech, history, Apple. Expect a similarly wide range in the future as well. You can check all my weekly readings by checking the tag here. You can also get these as a weekly newsletter by subscribing here.
The map and problem described here were part of my presentation Mapping as a tool for thought, and mentioned in my interview with John Grant and Ben Mosior (to appear sometime soon in the Wardley Maps community YouTube channel). I’m looking for ideas on how to make this map easier to understand and more useful, so I posted it to the Wardley Maps Community forums requesting comments.
_This week is a bit light on technical content because I was attending Scala Days 2019 in Lausanne and I had enough with the talks._
Software engineering, psychology, history. Expect a similarly wide range in the future as well. You can check all my weekly readings by checking the tag here. You can also get these as a weekly newsletter by subscribing here.
You know how you slip once on a habit and everything goes crazy? Well, I’ve been 4 weeks without writing these, so here’s the accumulated reading from 4 weeks. Because, even if I don’t write it, I read a lot anyway. Also, there’s a lot of interesting content this “week”.