2020#60 Readings
4 minutes read | 796 words by Ruben Berenguel
I am having a very hard time finishing my summary of the RDD paper.
🍿 John Cleese on Creativity In Management
This entertaining talk will give you several hints; the one I’ll keep is thinking about the open-closed mentality.
Talking, Typing, Thinking: Software Is Not a Desk Job
Optimising thinking time is underrated. Probably half of the value my company gets from me comes on the last day of a sprint, when there are no stories left and I have a few spare hours to think about areas of improvement instead of rushing to review PRs or finish my own. This is why slack (the concept, not the app/company) is so important.
Emerging Architectures for Modern Data Infrastructure
This is a very interesting analysis of the data landscape. You can see here why I see more of a future in Databricks than in Snowflake: it covers more areas, although Snowflake is planning to catch up.
Fixing the d3-zoom API… for my use-case
The d3-zoom demo in the post is excellent.
The story behind Markdown
I like the simplicity of Markdown. It’s not a format to write good, complex stuff with: it’s the format to just get it out with passable legibility. Good enough is sometimes good enough.
Faster SQL: Adaptive Query Execution in Databricks
I have mentioned AQE before on this blog. It is a very important feature: imagine a table that the statistics metadata says has 100 rows, so the query is planned with a broadcast join, but that actually has 1 billion rows! This is what AQE fixes.
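The scenario above can be sketched as a toy planner. This is not Spark's implementation, only an illustration of the idea, assuming a join is broadcast whenever the believed size of the table falls under a threshold (Spark's `spark.sql.autoBroadcastJoinThreshold` defaults to 10 MB). The function names and the average row size are hypothetical.

```python
from typing import Tuple

# Toy illustration of adaptive re-planning; not Spark's actual code.
# 10 MB is Spark's default for spark.sql.autoBroadcastJoinThreshold.
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


def choose_join(size_bytes: int, threshold: int = BROADCAST_THRESHOLD_BYTES) -> str:
    """Pick a join strategy from the believed size of the smaller side."""
    return "broadcast" if size_bytes <= threshold else "sort_merge"


def plan_with_aqe(estimated_bytes: int, actual_bytes: int) -> Tuple[str, str]:
    """Return (static plan, adaptive plan).

    The static plan trusts the catalogue statistics; the adaptive plan
    decides again once the real size is known at runtime.
    """
    return choose_join(estimated_bytes), choose_join(actual_bytes)


ROW_BYTES = 100  # hypothetical average row size
static_plan, adaptive_plan = plan_with_aqe(
    estimated_bytes=100 * ROW_BYTES,            # statistics claim 100 rows
    actual_bytes=1_000_000_000 * ROW_BYTES,     # the table really has 1 billion rows
)
print(static_plan, adaptive_plan)
```

With stale statistics the static plan would try to broadcast the huge table, while the plan made from runtime sizes avoids it.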
Make Your Own Damn Mental Models
There’s an interesting point here: mental models are based on narrative thinking.
The remarkable number 1/89
This is a very surprising result with a very concise proof.
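Assuming the result in question is the well-known Fibonacci connection (the sum of F_n / 10^(n+1) over n ≥ 1 equals 1/89, which follows from evaluating the generating function x / (1 - x - x²) at x = 1/10), here is a small check with exact fractions. The function name is mine, and this is a numerical sanity check rather than the proof from the linked post.

```python
from fractions import Fraction


def fibonacci_decimal_sum(terms: int) -> Fraction:
    """Sum F_n / 10**(n + 1) for n = 1..terms, with F_1 = F_2 = 1."""
    total = Fraction(0)
    a, b = 1, 1  # F_1, F_2
    for n in range(1, terms + 1):
        total += Fraction(a, 10 ** (n + 1))
        a, b = b, a + b
    return total


partial = fibonacci_decimal_sum(60)
gap = Fraction(1, 89) - partial  # what the remaining terms still have to add
print(float(partial), float(gap))
```

The first few terms already show the pattern: 0.01 + 0.001 + 0.0002 + 0.00003 + 0.000005 = 0.011235, the leading digits of 1/89.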
Type in the exact number of machines to proceed
As mentioned in a Hacker News comment, this is like the point-and-call method used on Japanese railways. AWS does this for deletions, and so does GitHub. It’s a pretty good practice.
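A minimal sketch of the pattern, assuming the confirmation is simply the count of affected machines typed back by the operator. The function and machine names are hypothetical and not taken from the linked post or from any AWS or GitHub API.

```python
def confirm_by_typing(expected_count: int, typed: str) -> bool:
    """Return True only if the operator typed exactly the number of affected machines."""
    return typed.strip() == str(expected_count)


def delete_machines(machines: list, typed: str) -> list:
    """Hypothetical destructive action, guarded by the typed confirmation."""
    if not confirm_by_typing(len(machines), typed):
        raise ValueError("Confirmation did not match the number of machines")
    return []  # nothing is left once the deletion goes through


fleet = ["web-1", "web-2", "web-3"]
print(confirm_by_typing(len(fleet), "3"))
print(confirm_by_typing(len(fleet), "30"))
```

The point is that the operator has to look at the blast radius and restate it, instead of clicking through a generic "Are you sure?" dialog.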
Higher-kinded types in Python
This is amazing, terrifying and crazy at the same time. I’m looking forward to this PEP going ahead.
Search engines & libraries: an overview
I missed seeing a mention of the tiny ones you can embed in a webpage (like lunrjs, or fusejs, which powers search here), but I found out about many tools I may consider using when ElasticSearch/Lucene is overkill.
Predicting Football Results With Statistical Modelling: Dixon-Coles and Time-Weighting
Even if it doesn’t work that well in the end, the modelling approach is very interesting.
Surviving disillusionment
This is so important for developers. Burning out is way too easy; you have to find a way to keep your spirit shining.
Collaborative Single Player Mode
These are sensible rules, especially when working remotely. I think I’ll ~steal~ borrow some of these for our team.
🍿 Scaling Yourself
An old talk by Scott Hanselman, a great public speaker, developer and person in general.
Magnet: A scalable and performant shuffle architecture for Apache Spark
A project from LinkedIn Engineering to have better shuffling. I wonder what impact this can have in relatively small clusters, since in their case they have very large clusters.
Joins using LIKE or why PostgreSQL FTS is a powerful alternative
The indexing here looks like a piece of black magic, one I will add to my book of data-related spells.
Historical English Thesaurus - Treemap
This is an interesting resource, although I’d love a dynamic visualisation that lets you browse through the language.
The cheap pen that changed writing forever
Pens are so versatile. If you are into drawing, I recommend you get The Art of Ballpoint.
🍿 Algorithmic Orchestration demo
I love playing music (electronic, real, whatever) even if I suck at all the instruments I play. I’d love to be able to do this with Tidal Cycles, so I’m looking forward to what Oscar comes up with.
SMASH: an efficient compression algorithm for microcontrollers
I will never work on microcontrollers (I’m not that smart) but special tools and techniques are useful in many areas. The more you know…