Welcome! I’m Ruben Berenguel, the mind behind the writing here at
mostlymaths.net. I have a PhD in Mathematics and work as a Lead Data Engineer.
I have a varied set of interests that range from data engineering and management (my current job) to bespoke shoemaking (a previous job). You can imagine how much fits in between.
You can reach me via:
You can find code I have written or contributed to, and slides for talks I have given on
Subscribe to my newsletter, where I post links to interesting articles in the areas of programming, data engineering and miscellanea. You can read previous editions online on this tag.
You can explore some of my interests by checking the
interactive D3 sitemap of this site (in beta).
If you are interested in what technologies, fonts, etc. are used in this blog, you can read the
I have three research publications you likely are not interested in:
Data+AI Summit 2021
In this talk, I explain how Hybrid Theory leverages Spark and GraphFrames to construct and maintain a 2000-million-node identity graph with minimal computational cost.
An introductory workshop with Jupyter Notebooks on how Spark works and the basics of PySpark.
Spark+AI Summit Europe
A dive into the details of how PySpark and Spark communicate, and the improvements Apache Arrow brings to this for Python developers.
I split a session with
Yvan Phelizot for people interested in formal verification. I introduced TLA+ with the slides from last year’s Scala Exchange and showed some examples of Alloy, and he talked about proofs in Coq.
A lightning talk on how to use a TLA+ model to verify an Akka actor design.
Python Barcelona meetup
The first iteration on my talk about the internals of PySpark and Spark. Mind the
F word in the video
Joint talk with Carlos Peña, introducing Spark’s concepts.