2021#02 Readings
5 minutes read | 915 words by Ruben Berenguel
The video edition
This is a video-heavy edition. Over the past “Readings of the week” years I have managed to reduce my pending readings from 392 posts to 4 (right now I have 12). I have been tracking the length of this list more or less weekly:
This was well and good… But I hadn’t tracked my “videos pending to watch”. The pandemic made this grow enormously, even if I kept deleting videos from past conferences which were superseded by new conferences:
It hit more than 100 after the last Spark Summit, and even though I tried to watch more of them, a conference video takes 30 something minutes, even at 1.5x. And a lot of times, I realise it’s not that interesting halfway through. And very often, with the slides and a bit of the spoken context I have enough to see the point and run with it.
Haskell and automation to the rescue: I wrote a small helper utility that converts videos to static webpages with images and captions on the side (first link today). It took me a couple of afternoons to do it (around 7 hours-ish?). That big drop at the end of the graph is my glancing over 18 videos, which probably took me 40 minutes total. Watch time would have been around 8 hours, so it has already paid for itself in time saved and will continue to save me time.
Never underestimate what you can automate and how it can give you more time for other stuff.
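A back-of-the-envelope check of the numbers above, using only the rough figures quoted in this post (they are estimates, so treat the result as approximate):

```python
# Rough figures quoted above, all in minutes.
build_cost = 7 * 60   # around 7 hours writing the tool
watch_time = 8 * 60   # estimated time to actually watch the 18 videos
glance_time = 40      # time spent glancing over them instead
videos = 18

saved = watch_time - glance_time   # minutes saved on this first batch
net = saved - build_cost           # what is left after paying for the build
per_video_saved = saved / videos   # average minutes saved per video

print(f"saved {saved} min, net {net} min, {per_video_saved:.1f} min per video")
```

With these estimates the first batch alone roughly covers the build time, and every later video is pure saving.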
📯 rberenguel/glancer: Glance over some technical videos
This is the cause of this being the video edition, and probably one of my most useful “small helper programs”. It takes a Youtube video URL, extracts frames every 30 seconds and creates a static webpage with the frames and Youtube’s automated captions. Makes breezing through technical videos super easy. You can see an example output here.
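To make the idea concrete, here is a minimal Python sketch of the "frames plus captions" page. It is not glancer's code (glancer is written in Haskell and handles the download and frame extraction too); the function names, the frame file naming and the sample captions are all hypothetical, and it assumes the frames already exist on disk.

```python
from html import escape

FRAME_INTERVAL = 30  # seconds between extracted frames, as described above


def bucket_captions(captions, interval=FRAME_INTERVAL):
    """Group (start_seconds, text) captions by the frame they fall under."""
    buckets = {}
    for start, text in captions:
        index = int(start // interval)
        buckets.setdefault(index, []).append(text)
    return buckets


def render_page(captions, interval=FRAME_INTERVAL):
    """Build a static HTML page: one frame image with its captions beside it.

    Frames without any caption are skipped in this sketch.
    """
    rows = []
    for index, texts in sorted(bucket_captions(captions, interval).items()):
        # Hypothetical file naming: frame_0000.jpg, frame_0001.jpg, ...
        image = f"frame_{index:04d}.jpg"
        caption = escape(" ".join(texts))
        rows.append(
            f'<div class="slide"><img src="{image}"><p>{caption}</p></div>'
        )
    return "<html><body>\n" + "\n".join(rows) + "\n</body></html>"


# Made-up captions, purely for illustration.
sample = [
    (2.0, "Welcome to the talk."),
    (31.5, "Here is the architecture."),
    (44.0, "It has <three> parts."),
]
page = render_page(sample)
```

Writing `page` to an `index.html` next to the frame images would give a page you can skim in a browser.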
🍿 Thinking Outside the GIL with AsyncIO and Multiprocessing
I have used aiomultiprocess for a high IO thingy, and it worked excellently.
📚 How to Write Short: Word Craft for Fast Times
I have been reading this concurrently with R.P. Clark’s Writing Tools. This one is lighter; I’m taking copious notes from Writing Tools.
🍿 How to Build a Functional API
I like the example of safe parsing of Excel workbooks. We had to do this for work in Python and… Safety goes BRRRR.
🍿 Dask-on-Ray: Using Dask and Ray to Analyze Petabytes of Remote Sensing Data
This seems like a powerful way to distribute even more Dask tasks, or at least have more fault tolerance for Dask. To be fair, I’d just use Dask, but Ray is an interesting project by itself.
🍿 Quantifying Quality for Engineers
This looks fascinating. Having objective formulas for quality is a move toward a better future in software engineering.
🍿 Ibis: Seamless Transition Between Pandas and Apache Spark
So, Ibis is a declarative query “frontend” that can use several backends (think Spark, Dask, others). Learn one API, run it in many places. This may sound familiar: it’s basically SQL over different query engines. The problem would then be the quality of the Ibis layer on top of a Spark executor, for instance.
🍿 Redash: Easily Visualize, Dashboard, and Share your Data
Redash on Databricks is looking good according to this talk. At work we have been given access, but so far we’ve had no time to check.
🍿 Quickstrom: Specifying and Testing Web Applications
I’ve been interested in Quickstrom since I saw Oskar talk about it. It’s a pity I don’t have (and am unlikely to have) to test any serious web application.
In-Depth: The Eerie Beauty Of The Apple Watch Solar Face, And The Anatomy Of Nightfall
I have added this watch face to my rotation (which is usually either Meridian or Chronograph).
My year in data
Can’t argue with a quantified self person.
🐦 “If you have wondered how electric rice cookers know when to stop cooking…”
This is electrical engineering at its finest.
I have used Lucene and its descendants (Solr, Elasticsearch), but the benchmarks here make me think I should give this a shot for my next project involving search.
Wolfenstein 3D Classic
This story of how John Carmack ported Wolfenstein 3D to the iPhone is pretty awesome.
Apache Beam for Search: Getting Started by Hacking Time
I’m not 100% sure I understood the example in this article correctly, even though I know more or less how the Beam model of time works. It’s probably me, though: this article is very well written.