479: quantum of sollazzo
#479: quantum of sollazzo – 2 August 2022
The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I've been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you're welcome to become a friend via the links below.
Every week I include a six-question interview with an inspiring data person. This week, I speak with Riccardo Saporiti, a freelance data journalist who's worked for Wired Italia and InfoData, the data initiative of Italian newspaper "Il Sole 24 Ore".
The most clicked link last week was the AI-driven photo clean-up tool. Indeed, a few of you have written to say how useful it might be.
'Til next week,
Giuseppe @puntofisso
Become a Friend of Quantum of Sollazzo → If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker. You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
Six questions to...
Riccardo Saporiti
Photo Credits Umberto Costamagna
Topical
Voici qui vit dans les pires îlots de chaleur de votre ville
Or, as translated, "Here's who lives in your city's worst heat islands".
Original French and automatic English translation showing how the socio-economic divide in Canada is deepened by climate change.
TikTok story
"It seems like many of the emerging musicians to “make it big” are coming from TikTok. Here’s data to show what I mean."
Another engaging data-driven story by The Pudding.
Europe’s record summer of heat and fires – visualised
The Guardian graphics team visualised record temperatures and fires in Europe.
London hit 104 degrees
"That’s like 111 degrees in Boston."
Interesting approach by the Washington Post, using similarities to illustrate the changing climate.
El Pais did something similar (here's an automated translation into English), using data from a study published in PLOS ONE.
Black Districts Gutted as Suburban Flight Reshapes Congress Maps
"There are 22 majority-Black districts in the current Congress. Next year, there will be as few as nine."
La tierra seca, sin agua
A brilliant visualization of climate change in Mexico, originally in Spanish and here automatically translated into English: "The dry land, without water".
(h/t DataWrapper's DataVisDispatch)
Tools & Tutorials
OpenBBTerminal
This library provides an open source framework that looks like a Bloomberg terminal. This Twitter thread explains what it's for. They also have a website. As someone noted, though, it would be great to have the data, as opposed to a UI replica...
How to choose an interpolation for your color scale
Six Questions graduate Lisa Charlotte Muth of Datawrapper explains how she chooses the right colour interpolation, i.e. the way different data classes are smoothed according to a colour scale.
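To make the idea of colour interpolation concrete, here is a minimal sketch of the simplest variant: plain linear interpolation between two colours in RGB. This is my own illustration, not code from the article, and the function names (`hex_to_rgb`, `interpolate`, `colour_scale`) are hypothetical. Other interpolations work in different colour spaces and will give different in-between colours.

```python
def hex_to_rgb(hex_colour):
    """Convert '#rrggbb' into an (r, g, b) tuple of integers in 0..255."""
    hex_colour = hex_colour.lstrip("#")
    return tuple(int(hex_colour[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of integers back into '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def interpolate(start, end, t):
    """Linearly blend two hex colours channel by channel; t runs from 0 to 1."""
    a = hex_to_rgb(start)
    b = hex_to_rgb(end)
    return rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b)))


def colour_scale(start, end, steps):
    """Return `steps` evenly spaced colours from start to end (steps >= 2)."""
    return [interpolate(start, end, i / (steps - 1)) for i in range(steps)]


if __name__ == "__main__":
    # Five colours from white to a dark blue (an arbitrary example palette).
    print(colour_scale("#ffffff", "#08306b", 5))
```

Linear RGB is the easiest scheme to write down, which is why it's used here; whether it is the right one for a given chart is exactly the question the article addresses.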
Choosing the right map type for your data
"Looking to represent your geographical data on a map? This post will help you pick the most effective map type for your data."
An article by Flourish on whether or not you should choose a map, and if so, which map.
3 Word2Vec articles
These three articles look at different aspects of Word2Vec, and seem helpful to really understand how it works:
What’s Behind Word2vec
Words into Vectors
The Word2vec Hyperparameters
They're part of a larger series, linked in the articles.
Finding errors in datasets with Similarity Search
Coming with a handy demo, this article illustrates how distance-based methods can be used to find errors in categorized datasets.
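As a rough illustration of the distance-based idea, the sketch below flags items whose label disagrees with the majority label of their nearest neighbours. It is a toy version under my own assumptions (2D points, Euclidean distance, a hypothetical `flag_suspect_labels` function), not the article's implementation, which may well rely on embeddings and a proper similarity search engine.

```python
import math
from collections import Counter


def flag_suspect_labels(points, labels, k=3):
    """Return indices of points whose label differs from the majority
    label among their k nearest neighbours (Euclidean distance)."""
    suspects = []
    for i, p in enumerate(points):
        # Distance from this point to every other point, nearest first.
        distances = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbour_labels = [labels[j] for _, j in distances[:k]]
        majority, _ = Counter(neighbour_labels).most_common(1)[0]
        if majority != labels[i]:
            suspects.append(i)
    return suspects


if __name__ == "__main__":
    # Two well-separated clusters; the point at index 3 sits in the
    # first cluster but carries the second cluster's label.
    points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
    labels = ["A", "A", "A", "B", "B", "B", "B", "B"]
    print(flag_suspect_labels(points, labels))
```

A flagged item is only a candidate for review: a point near the boundary between two categories can disagree with its neighbours without being wrong.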
Machine Learning Algorithms with Python
Yes, all of them implemented and explained.
Pretty Maps in Python
Pretty good tutorial.
"The project only contains 425 lines of Python due to intense 3rd-party package use.
The map data is collected from OpenStreetMap using the OSMnx library. This library itself is made up of only 3,700 lines of Python as its relying on NetworkX. NetworkX wraps up functionality relating to complex networks and is made up of 78K lines of Python. NetworkX is largely the work of Aric Hagberg, an applied mathematician at the Los Alamos National Laboratory as well as Jarrod Millman who was once the release manager for both NumPy and SciPy.
Rendering is handled by vsketch, a project made up of 4.6K lines of Python and is based largely on the efforts of Antoine Beyeler, an entrepreneur based in Switzerland.
In this post, I'll walk through generating the following rendering of Tallinn's Old Town."
There's also a website allowing you to generate the maps (but don't abuse it, please).
We found the best JavaScript newsletter.
Bytes is probably the funniest web dev newsletter you'll ever read (trust me). If you like our newsletter, I've got a feeling you'll love Bytes too. There's a reason 100k developers read it every week.
Data thinking
The Only Algorithm for Hard Problems: Shake and Pull Gently
"(Or, “regularized greedy algorithms and their applications.”)"
Which prompts me to ask: is "switch it off and on again" a greedy algorithm?
"It occurs to me that there’s something I oughtta admit. It actually took me a bunch of tries to get that shot of the headphones coming untied. Most of the times I tried it didn’t work, or only worked partially, leaving me with some knots to undo by hand. Even if I can describe the technique in a simple way, actually applying it takes practice and skill.
That applies to all the algorithms I’ve talked about here. Neural networks, SMT solvers, Place & Route engines; none of these things are straightforward. The devil is in the details, and getting to know such details can be a life’s work. I don’t want to denigrate that work; just point out some shared themes."
Dataviz, Data Analysis, & Interactive
Umbria Urban Ecosystem
Sadly, this website doesn't translate very well, but I still wanted to show it because it makes excellent use of radar/spider charts to visualise environmental data in a way that can be compared across multiple locations. It's really good work!
ERCOT (Texas Power Grid) by @danopia
A DataDog dashboard of power generation and usage in Texas.
Xanadu
Xanadu was the first hypertext ever designed, described in the 1960s by Ted Nelson; alternatively, it has been described as "the longest-running vaporware story in the history of the computer industry". It is an interesting concept, although their claims of superiority to the World Wide Web are a bit stuck in the past. Also, I'm not entirely sure who owns their official website because their copyright statement is a little puzzling. But here you go, enjoy.
The masked bandits come to Germany
A Datawrapper blog post by Rose Mintzer-Sweeney where we learn that Germany is... invaded by raccoons.
Fast Matrix Multiplication (Animated)
Exactly what it says on the tin.
quantum of sollazzo is supported by ProofRed's excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.
Sponsors* casperdcl and iterative.ai Jeff Wilson Fay Simcock Naomi Penfold
[*] this is for all $5+/month GitHub sponsors. If you are one of those and don't appear here, please e-mail me.