2020#48 Readings of the Week
8 minutes read | 1548 words by Ruben Berenguel
Spark 3 is here! Rejoice!
NOTE: Spark (quite a bit), Python, Golang, Maths and a bit of miscellanea. Expect a similarly wide range in the future as well. You can find all my weekly readings under the tag here. You can also get these as a weekly newsletter by subscribing here.
How to Solve Non-Serializable Errors When Instantiating Objects In Spark UDFs
I guess it’s one of those “if you haven’t seen this issue you have not written enough Spark code yet”. Or you have impressive luck, congrats.
Apple Team Working on VR and AR Headset and AR Glasses
A friend just got an Oculus Quest and so far is very happy. I’m looking forward to virtual terminals.
Five highlights on the Spark 3.0.0 Release
A quick summary of what you can find in Spark 3 from Thiago. I fully endorse his Python comment here.
No more ad-hoc requests! The journey from data service to data product organizations
This is, of course, assuming the organisation uses the tools you provide instead of not even paying attention.
Egyptian Karkade - Hibiscus Iced Tea Recipe
This is what we drink at home during summer (I’m sipping a glass right now). This makes it sound more complicated than it should. Steps:
1. Buy dried hibiscus flowers and wait until they are delivered
2. Place enough to cover the bottom of a glass bottle or heat/cold proof container (say, a 1 or 2 liter bottle) so you don’t see much of the bottom (tweak as needed after the first time you do it)
3. Boil 1/2 cups of water and pour it over the flowers from step 2
4. Let it rest until warm, fill the bottle fully and place it in the fridge
5. Enjoy, optionally with sugar.
If the flowers are large (which is usual) you just need a bottle with a coarse filter, or just be a bit careful. No need to “do” much. It’s excellent for the heat, and if you happen to have pomegranates around, get a thin sliver of peel (without any white rind, important) and add it to step 2. Optionally as well, add a lot of peel and some of the pomelo flesh for a more bitter, tarter taste.
Go generics draft design: building a hashtable
I’m liking the proposal for Go generics now. This is what I thought they should be some time ago.
Testing PySpark Code
Matthew Powers (you may have visited his blog, Munging Data) has created a PySpark testing library that looks great. It has my PySpark seal of approval!
How the Nintendo Switch prevents downgrades by irreparably blowing its own fuses
Interesting information tidbit.
Databricks was ready for the recession. Now the start-up is poised to go public next year
I use Databricks at work (and know a few people there). It is an excellent product, but… are open-source-projects-turned-companies really going to be great/shape the future/not close? There are not that many examples to draw from, and most I can think of are either very recent (Confluent, Databricks) or not very large/doing well (Lightbend). And that’s not counting Hortonworks/Cloudera.
Copying Better: How To Acquire The Tacit Knowledge of Experts
A very interesting follow-up to part 1 (in last week’s newsletter/post). You may recognise the domain/blog from another previous newsletter, about the metagame.
Two years of micro-frontends: A retrospective
Such flexibility sounds like a nightmare for some people I know.
Hiring and the market for lemons
I vaguely remember reading this post a few years ago. It is still an interesting analysis of the developer market.
TornadoVM: Accelerating Java with GPUs and FPGAs
A microkernel and VM on top of Graal to generate OpenCL code for the JVM. That’s a mouthful.
Want to Be More Productive? Try Doing Less.
There are some interesting ideas in this.
Using JAX, numpy, and optimization techniques to improve separable image filters
This is a very understandable explanation of how to use automatic differentiation (a very powerful technique) to optimise. And in particular, how to use the JAX library (an almost drop-in numpy replacement with built-in automatic differentiation on CPUs and GPUs).
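As a rough illustration of what automatic differentiation does, here is a minimal sketch in plain Python using forward-mode dual numbers. It is not JAX's API (JAX exposes the same idea through `jax.grad`), and the `Dual` class, the `loss` function and the learning rate are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number carrying its value and its derivative with respect to x."""
    value: float
    deriv: float = 0.0

    def _lift(self, other):
        # Treat plain numbers as constants (derivative 0).
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = self._lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return self._lift(other) - self

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).deriv


def minimise(f, x0, learning_rate=0.1, steps=200):
    """Plain gradient descent driven by the automatic derivative."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * derivative(f, x)
    return x


def loss(x):
    # Hypothetical objective with its minimum at x = 3.
    return (x - 3) * (x - 3) + 1


best = minimise(loss, x0=0.0)
```

The point is that `loss` is written as ordinary arithmetic and the derivative falls out of evaluating it, with no symbolic manipulation and no finite differences.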
How are Unix pipes implemented?
This is probably way more than you want to know about how pipes are implemented. Trust me.
What Is the Geometry of the Universe?
Pretty good explanations and diagrams, as usual with Quanta.
New Math Measures the Repulsive Force Within Polynomials
Another article from Quanta. This and Nautilus are the only publications I read regularly, and for a reason: good writing and good science.
Writing New Image Hashing Algorithms to Help a Yearbook Teacher
Locality sensitive hashing! Unexpected, I wasn’t aware it could be used for images.
How we came to create a new image placeholder algorithm, BlurHash
The result looks very good. I need to try to write a version of this.
Not just CPU: writing custom profilers for Python
I wasn’t aware you could write your own profilers for Python. After reading this, you will know too.
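To give a flavour of how small a custom profiler can be, here is a minimal sketch built on the standard library hook `sys.setprofile`. It counts calls instead of measuring time. It is not necessarily the approach taken in the linked article, and `count_calls` and `fib` are names invented for this example.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func and count how often each Python function is entered."""
    calls = Counter()

    def profiler(frame, event, arg):
        # "call" fires whenever a Python-level function is entered;
        # calls into C functions are reported as "c_call" and ignored here.
        if event == "call":
            calls[frame.f_code.co_name] += 1

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook
    return result, calls


def fib(n):
    # Deliberately naive recursion so there is something to count.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result, calls = count_calls(fib, 10)
```

Swapping the counter for any other measurement taken inside `profiler` (memory, open files, elapsed wall time) gives a profiler for that resource instead.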
How a Gang of Harmonica Geeks Saved the Soul of the Blues Harp
The harmonica is one of the instruments I own and play badly, and this is a very interesting tour through its history and renaissance. I also recommend you get one (they are very cheap) and learn to play some blues. Why not? Music is fun.
Marketing Yourself (without Being a Celebrity)
One of the best reads recently. Almost everybody is promoting themselves in one way or another. Do it well, or do it better. It can’t hurt (unless you are too obnoxious).
From chunking to parallelism: faster Pandas with Dask
We have experimented with Dask at work recently. We have liked what we have seen… But in the end we are defaulting to Spark/PySpark/Pandas or plain ol' from multiprocessing import Pool. Still, Dask is good to have in your bag of tricks.
The George Foreman Grill Changed the Way Men Cook Forever
Times are a-changing, but from what I know of Europe, even college kids at that time could cook. We still have grills, though, but we call them sandwich-makers, and use them for that. Usually with Nutella.
An Algorithm for Compressing Space and Time
This is a very fancy title to explain how Gosper’s algorithm (Hashlife) is used to tremendously speed up the computations involved in Conway’s Game of Life. Hint: quadtrees.
Hypermodern Python 1: Setup, pyenv, poetry, click, requests
This is the first chapter in a 7-post long tour across modern practices with Python. This one is about setting up your device, using pyenv to install different versions of Python, using Poetry to manage dependencies and creating a simple CLI application using Click and requests to fetch from an API.
🍿 Rust for Weld, a High Performance Parallel JIT Compiler
Every time I think I don’t really need to learn Rust I’m reminded of Weld. Although Weld is only going to be an internal piece in other frameworks, I need to understand how the sausage is made. You can read the slides here.
🍿 Getting Real About Managing Up
The days of, put your headphones on, code for six months, throw it over the wall to the release manager, which, by the way, were not the good old days, I promise you, those are long gone.
🍿 Sufficiently Advanced Testing
You can find a transcript here. This is an excellent introduction to property-based testing and the Hypothesis Python library, and to advanced testing concepts (like metamorphic testing). Very high on my recommendation list this week.
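For anyone who has not met these ideas, here is a minimal hand-rolled sketch of a property-based test with one metamorphic relation, using only the standard `random` module. It is deliberately not Hypothesis's API (Hypothesis adds input strategies and shrinking of failing cases on top of this idea), and `dedupe`, `random_list` and `check_properties` are hypothetical names for illustration.

```python
import random


def dedupe(items):
    """Function under test: drop duplicates, keep first-seen order."""
    seen = set()
    out = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


def random_list(rng, max_len=20):
    # A small value range makes duplicates likely.
    return [rng.randint(-5, 5) for _ in range(rng.randint(0, max_len))]


def check_properties(trials=500, seed=0):
    """Check general properties on many random inputs instead of fixed cases."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = random_list(rng)
        ys = dedupe(xs)
        # Property: no duplicates remain, and no values are lost or invented.
        assert len(ys) == len(set(ys))
        assert set(ys) == set(xs)
        # Property: applying it twice changes nothing (idempotence).
        assert dedupe(ys) == ys
        # Metamorphic relation: appending the input to itself only adds
        # duplicates, so the output must be unchanged.
        assert dedupe(xs + xs) == ys
    return trials


checked = check_properties()
```

The metamorphic check is the interesting one: it never needs to know the right answer for `xs`, only how the answer should change when the input is transformed.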
Drawing good architecture diagrams
Diagramming is hard.
Building Finite State Machines with Python Coroutines
I have a project that involves coroutines. Some day. When I have time and energy. Then the world will be mine.
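For a taste of the idea, here is a minimal sketch of a state machine written as a generator-based coroutine. The turnstile machine and its event names are invented for this example and are not taken from the linked article.

```python
def turnstile():
    """A two-state machine: 'locked' until a coin, 'unlocked' until a push."""
    state = "locked"
    while True:
        # Hand the current state to the caller and wait for the next event.
        event = yield state
        if state == "locked" and event == "coin":
            state = "unlocked"
        elif state == "unlocked" and event == "push":
            state = "locked"
        # Any other event leaves the state unchanged.


def run(events):
    """Feed events to a fresh machine and collect the state after each one."""
    machine = turnstile()
    next(machine)  # prime the coroutine: advance it to the first yield
    return [machine.send(event) for event in events]


history = run(["coin", "push", "push", "coin"])
```

The coroutine keeps its state in a local variable between `send` calls, so there is no class or explicit transition table to maintain.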
How to Use Generators and yield in Python
I keep referring to this article from Real Python. If you have never visited their site (I doubt it), check it out. Excellent content.
🍿 11 Levels of Origami: Easy to Complex
I’m not a fan of Robert Lang’s origami design (too complex, too many insects), but he’s not only a master at it but is also very good at showing why. He has several books explaining how to construct very complex bases and crease patterns, and even wrote software for that. Highly recommended video.
🔊 The Switch
Another book by the Heath brothers, and like others I have reviewed before, recommended. They are always entertaining and to the point, even if you have read/heard about the matter several times and know all the stories.
🔊 Total recall
Arnold Schwarzenegger’s autobiography. It’s pretty entertaining, and is quite a feel-good read/listen about how putting in the work pays back.