2020#57 Readings
5 minutes read | 976 words by Ruben Berenguel
I have dropped the Weekly from the title. It was about time.
I have skipped a week, due to writing a couple of long posts (links below) and decided to finally drop the weekly part of these Readings.
📯 Summary of Good Strategy/Bad Strategy
This post has been a long time in the making (I finished the book a few months ago), but I had a lot of notes and didn’t know how to organise them. I hope the end result reads well.
📯 Running SparkSQL on Databricks via Airflow’s JDBC operator
If you have ever wanted to run SQL from Airflow in a Databricks cluster, this will tell you how. And yes, I still don’t like Airflow.
📯 Does Snowflake have a technical moat worth 60 billion?
I wanted to condense several sources into a single post, while answering the question in the title. On the technical side, this works as a summary of the Snowflake SIGMOD paper.
Strategic Deriving
This is an extremely detailed explanation of how deriving works and is used in Haskell. Probably more than you need to know, but it is a very interesting subject.
Cooking Classes with Datatype Generic Programming
I’m still trying to wrap my head around what you can do with a Generic instance in Haskell. Getting there.
The Science Behind WFH Dressing for Zoom
The research showed that the combination of wearing certain clothes and their symbolic meaning led to more focused attention.
X-COM at The Digital Antiquarian
I never played the original X-COM, but I played a fan-made version on iOS some years ago. And it really was hard. I’m tempted now to fire up the iOS version of the rebooted franchise just for kicks.
🍿 Raycasting engine in Factorio 1.0 (unmodded)
I only know of Factorio from tweeps who talk about it from time to time (so I know the general idea). This is some high-level low-level engineering.
Annoying things in Scala 2 that’ll be (mostly) gone in Scala 3
A list of what’s coming for Scala 3. Soon!
Asteroids: By the Numbers
An analysis of the game Asteroids. Size of the elements, speed, design decisions.
Under the hood of Spark performance, or why query compilation matters
I read the SparkSQL paper some time ago so I may be misremembering, but I think there is a mistake here. SparkSQL still uses a Volcano execution model; it just squashes as much of the plan as possible into whole-stage code generation phases. The model is the same, but the code is flattened or pushed down wherever it can be. I need to re-read the paper for a project I want to write, so I may update this comment at some point.
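To make the distinction concrete, here is a toy sketch in Python. It is not Spark code: the operator names, the query and the sample rows are all made up for illustration. The first plan chains pull-based operators in the Volcano style, where each operator pulls rows from its child one at a time. The second is the same query collapsed by hand into a single loop, which is roughly the shape whole-stage code generation aims for.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]

# Hypothetical sample data, only here to exercise the two plans.
SAMPLE_ROWS: List[Row] = [
    {"name": "Ada", "age": 36},
    {"name": "Bob", "age": 25},
    {"name": "Cy", "age": 41},
]


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: hands out rows one by one."""
    for row in rows:
        yield row


def filter_op(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Pulls from its child and only passes on matching rows."""
    for row in child:
        if predicate(row):
            yield row


def project(child: Iterator[Row], columns: List[str]) -> Iterator[Row]:
    """Pulls from its child and keeps only the requested columns."""
    for row in child:
        yield {column: row[column] for column in columns}


def volcano_plan(rows: Iterable[Row]) -> List[Row]:
    """Volcano style: a tree of operators, each pulling from the one below."""
    plan = project(filter_op(scan(rows), lambda r: r["age"] > 30), ["name"])
    return list(plan)


def fused_plan(rows: Iterable[Row]) -> List[Row]:
    """Same query with scan, filter and project flattened into one loop."""
    result: List[Row] = []
    for row in rows:
        if row["age"] > 30:
            result.append({"name": row["name"]})
    return result
```

Both functions return the same rows for the same input. What changes is how many operator boundaries each row has to cross on the way out.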
Keeping your data pipelines healthy with the Great Expectations GitHub Action
Someday I will have enough time to add Great Expectations to some projects.
Fooling Around with Foveated Rendering
I expected the results to be better, but I suspect it’s due to the subject not being in foveal focus during the animation.
Grapefruit Is One of the Weirdest Fruits on the Planet
If you eat grapefruit, this will scare you shitless.
Python 3.9: Cool New Features for You to Try
Topological sort, annotated types, nicer generics… and teapot status codes. And a few more.
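A quick taste of the features listed above, as a minimal sketch that needs Python 3.9 or later. The dependency graph, the function names and the "metres" annotation are invented for the example and do not come from the linked article.

```python
from graphlib import TopologicalSorter
from http import HTTPStatus
from typing import Annotated, get_type_hints

# Topological sort: each key maps to the set of tasks it depends on.
dependencies = {"deploy": {"test"}, "test": {"build"}, "build": set()}
build_order = list(TopologicalSorter(dependencies).static_order())


# Nicer generics: builtin collections can be subscripted directly.
def total(values: list[int]) -> int:
    return sum(values)


# Annotated types: attach arbitrary metadata to a type hint.
Metres = Annotated[float, "metres"]


def double(distance: Metres) -> Metres:
    return distance * 2


# include_extras=True keeps the Annotated metadata instead of stripping it.
hints = get_type_hints(double, include_extras=True)

# And the teapot status code.
teapot = HTTPStatus.IM_A_TEAPOT
```

Since the made-up graph is a simple chain, there is only one valid order: build first, then test, then deploy.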
Color blindness
Rob Pike (yes) talking about what color blindness implies and how it happens (and what effects it has on him). Pretty interesting (much like the pomelo/citron stuff in the grapefruit article).
Adaptive Query Execution (AQE) in Spark 3.0
I had somehow forgotten AQE was a thing. I wonder if probe-side optimisation in distributed hash joins could be added to AQE…
Making GHCIDE smarter and faster: a fellowship summary
Looking forward to this being available in ghcide.
The Gamer/Arbitrageur to Generalist Pipeline
I found this week’s free edition of The Diff newsletter particularly interesting.
Get started with machine learning on Arduino
Hobby-level machine learning tools are what will eventually open up AI to everybody.
Redshift, Snowflake, and the Two Philosophies of Cost
This short article about the economic space between Redshift and Snowflake is a good complement to my post above.
Spark Structured Streaming in K8s with Argo CD
A very detailed post by Albert about how they use K8s and ArgoCD at Typeform.
The Giant Pink Bunny at Colletto Fava
These are some distressing pictures.
Resilience and Vibrancy: The 2020 Data & AI Landscape
Pretty complete, although a bit long-winded.