2020#58 Readings
4 minutes read | 739 words by Ruben Berenguel
I have read quite a bit this week, and I’m also preparing a summary of the RDD paper.
📯 Databricks' Delta Lake: high on ACID
I summarised the Delta Lake paper. Most people who have read it have enjoyed it, which is pretty great.
Trampolining and stack safety in Scala
This almost made me understand trampolines. I suspect a couple more reads and I’ll have it.
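For reference, the core idea of a trampoline fits in a few lines. This is a minimal sketch in Python rather than Scala, and it is not the code from the linked post: the names `Done`, `More`, `run`, `is_even` and `is_odd` are made up for illustration. Instead of calling itself, each function returns a description of the next step, and a plain loop runs those steps, so the call stack never grows.

```python
class Done:
    """Final result of a trampolined computation (hypothetical name)."""
    def __init__(self, value):
        self.value = value


class More:
    """A suspended step: a zero-argument function returning the next step."""
    def __init__(self, thunk):
        self.thunk = thunk


def run(step):
    # The trampoline itself: bounce from step to step in a loop,
    # so stack depth stays constant however long the chain is.
    while isinstance(step, More):
        step = step.thunk()
    return step.value


def is_even(n):
    # Mutual recursion expressed as data instead of nested calls.
    return Done(True) if n == 0 else More(lambda: is_odd(n - 1))


def is_odd(n):
    return Done(False) if n == 0 else More(lambda: is_even(n - 1))


if __name__ == "__main__":
    # Far deeper than Python's default recursion limit, yet no RecursionError.
    print(run(is_even(100_000)))
```

Written as plain mutual recursion, the same pair of functions would hit the recursion limit long before reaching 100,000 calls.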
Faster Canvas Pixel Manipulation with Typed Arrays
This is a very old post, and I don’t think I can use anything here for my generative art, but it was a very interesting read.
How to Prepare for Software Engineering Interviews
There are many good resources here. Even if the subject is trite and you don’t care, the resources here are interesting just for their own sake.
Benchmarking the Performance of Amazon Redshift RA3.16xl
These look interesting, but given the minimum size and cost of an RA3 cluster, it does not exactly fit the data volumes of an SMB (small or medium business).
Probabilistic Modeling with PRISM
I’m not sure I get how I’d use it, but since Hillel doesn’t either (or didn’t), I don’t feel as bad.
The Darien Scheme was one of history’s worst ideas
¯\﹍(ツ)﹍/¯
Delivering with Haskell
This is a very well-written post about best practices for introducing Haskell. I don’t think I disagree with anything here (or at least not strongly; I might argue with a choice or two).
Store and Access Time Series Data at Any Scale with Amazon Timestream
Yet another product from Amazon. At least they don’t “do a Google” with them.
Scala 3: New, but Optional Syntax
At first it looks as if the post is dismissive of the new syntax, but surprisingly, it is in favour. Personally, I like indentation-based syntax. It feels correct.
rqlite/rqlite: The lightweight, distributed relational database built on SQLite.
This is an interesting approach to lightweight distributed data. In short, it’s SQLite with Raft.
The Pulsar Chart That Became a Pop Icon Turns 50: Joy Division’s Unknown Pleasures
What can I say, it’s a classic in data visualization. There’s also a very cool t-shirt for Scala lovers, by 47 degrees.
World’s smallest office suite
This is intended as a joke, but I wonder how much of a joke it really needs to be. I have tweaked them into my own version (I like fonts and margins): serif, monospace. You need to open them in a new tab manually (no idea why), and you can bookmark them afterwards (or just copy the link).
Nemo: Data discovery at Facebook
This sounds like a very good data discovery platform. I need to know more!
Unlock Your Productivity By Taking Better Notes
My biggest time sink is juggling development time and keeping on top of everything that is going on in the data area.
🍿 Haskell: Monads. A 5-minute introduction
This is probably the clearest explanation of monads ever. Or at least, the one that best matches how I’d explain it.
A Pi-Powered Plan 9 Cluster
At one point I had a Plan 9 cluster with my MacBook (with Plan 9 from User Space), a very old laptop and a Raspberry Pi sharing a Venti fileserver. That was some serious yak shaving. This post has no Plan 9 in it, by the way.
How I Do Personal Experiments
This is about frameworks for evaluation and avoiding post hoc rationalisation. Worth reading if you like changing methods and self-improvement.
Animation of 14th century bridge construction in Europe
Not much more to add. It’s fascinating; I would never have guessed.
Creating a Spark Streaming ETL pipeline with Delta Lake at Gousto
The Autoloader from Databricks looks interesting.
Time-series compression algorithms, explained
Exactly what the title says, with examples, and very well explained.
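To give a flavour of the topic, here is a minimal sketch of delta encoding, one of the simplest techniques in this family. It is my own illustration, not code from the linked post, and the function names are made up. The idea is to store the first value and then only the differences between neighbours, which are small and repetitive for regularly sampled series and therefore easier to pack or compress further.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then each value's difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [cur - prev for prev, cur in zip(values, values[1:])]


def delta_decode(deltas):
    """Undo delta_encode with a running sum."""
    return list(accumulate(deltas))


if __name__ == "__main__":
    # Hypothetical timestamps sampled every 10 seconds.
    timestamps = [1_600_000_000, 1_600_000_010, 1_600_000_020, 1_600_000_030]
    encoded = delta_encode(timestamps)
    print(encoded)
    assert delta_decode(encoded) == timestamps
```

The encoding is lossless: decoding is just a running sum over the stored differences.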