quantum of sollazzo

August 2, 2022

#479: quantum of sollazzo – 2 August 2022

The data newsletter by @puntofisso.


Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I've been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you're welcome to become a friend via the links below.

·

Every week I include a six-question interview with an inspiring data person. This week, I speak with Riccardo Saporiti, a freelance data journalist who's worked for Wired Italia and InfoData, the data initiative of Italian newspaper "Il Sole 24 Ore".

·

The most clicked link last week was the AI-driven photo clean-up tool. Indeed, a few of you have written to say how useful it might be.

'Til next week,
Giuseppe @puntofisso


Become a Friend of Quantum of Sollazzo →

If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker.
Quantum of Sollazzo will always be free.

You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.


Six questions to...

Riccardo Saporiti

Riccardo is a freelance data journalist and producer, and a contributor to Wired.it and InfoData24-Il Sole24Ore.

Photo credit: Umberto Costamagna
What is your daily data work like and what tools do you use?
I wish I had a routine (nah, just kidding). Sometimes I deal with datasets released by statistical offices, say Eurostat, sometimes it’s news I try to break with data, sometimes it’s my editors or me coming up with ideas, sometimes I cross different datasets to see if anything pops out. I do data prepping with LibreOffice or Google Sheets, I don’t deal with big data, and I visualize them with Tableau Public.

Tell me about a data project that you're proud of...
I pazienti dimenticati (Forgotten patients). An investigation conducted by filing more than 200 FOIA requests to local NHS branches in Italy to gather information about surgeries, exams, visits and cancer screenings postponed during the first wave of the pandemic. Turns out we had parts of the country with low pandemic incidence and high percentages of postponed healthcare services.

...and a data project that someone else did and you're jealous of.
It’s called Mai dati, it’s an investigation on abortion in Italy. More specifically, it’s an investigation on doctors and nurses refusing to perform abortions, as the law in Italy allows them to do. It’s a work by Chiara Lalli and Sonia Montegiove. They made no dataviz, but they wrote a book about it.

If I say "dataset", you think of...
Something I need to visualize. Or maybe cross with another one. VLOOKUP is my friend.

Give someone new to data a tip or lesson you wish you'd learned earlier.
Always cross different datasets, look for correlations that might expand your knowledge. And keep your dataviz as simple to understand as possible.

Data is or data are...
Since it’s Latin, data are.

Topical

Voici qui vit dans les pires îlots de chaleur de votre ville

Or, as translated, "Here's who lives in your city's worst heat islands".
Original French and automatic English translation showing how the socio-economic divide in Canada is deepened by climate change.


TikTok story

"It seems like many of the emerging musicians to “make it big” are coming from TikTok. Here’s data to show what I mean."
Another engaging data-driven story by The Pudding.


Europe’s record summer of heat and fires – visualised

The Guardian graphics team visualised record temperatures and fires in Europe.


London hit 104 degrees

"That’s like 111 degrees in Boston."
Interesting approach by the Washington Post, using similarities to illustrate the changing climate.
El País did something similar (here's an automated translation into English), using data from a study published in PLOS ONE.


Black Districts Gutted as Suburban Flight Reshapes Congress Maps

"There are 22 majority-Black districts in the current Congress. Next year, there will be as few as nine."


La tierra seca, sin agua

A brilliant visualization of climate change in Mexico, originally in Spanish and here automatically translated into English: "The dry land, without water".
(h/t Datawrapper's Data Vis Dispatch)


Tools & Tutorials

OpenBBTerminal

This library provides an open source framework that looks like a Bloomberg terminal. This Twitter thread explains what it's for. They also have a website. As someone noted, though, it would be great to have the data, as opposed to a UI replica...


How to choose an interpolation for your color scale

Six Questions graduate Lisa Charlotte Muth of Datawrapper explains how she chooses the right colour interpolation, i.e. the way different data classes are smoothed according to a colour scale.
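
As a rough illustration of why the choice matters, here is a minimal Python sketch that is not taken from the article. It compares two ways of placing a value on a colour scale: linearly, and by quantile. The data, the two colours and all function names are made up for the example.

```python
def lerp_colour(start, end, t):
    """Linearly blend two RGB colours; t is clamped to the range [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def linear_position(value, vmin, vmax):
    """Linear interpolation: the position is proportional to the value itself."""
    return (value - vmin) / (vmax - vmin)


def quantile_position(value, values):
    """Quantile interpolation: the position is the share of data points below the value."""
    below = sum(1 for v in values if v < value)
    return below / (len(values) - 1)


values = [1, 2, 3, 4, 100]                    # hypothetical data with one outlier
light, dark = (255, 255, 255), (0, 0, 128)    # hypothetical scale end points

linear_colours = [lerp_colour(light, dark, linear_position(v, 1, 100)) for v in values]
quantile_colours = [lerp_colour(light, dark, quantile_position(v, values)) for v in values]

for v, lin, qua in zip(values, linear_colours, quantile_colours):
    print(v, lin, qua)
```

With a linear scale the outlier pushes the four small values into almost the same light shade, while the quantile scale spreads them evenly between the two end colours.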


Choosing the right map type for your data

"Looking to represent your geographical data on a map? This post will help you pick the most effective map type for your data."
An article by Flourish on whether or not you should choose a map, and if so, which map.


3 Word2Vec articles

These three articles look at different aspects of Word2Vec, and seem helpful to really understand how it works:
What’s Behind Word2vec
Words into Vectors
The Word2vec Hyperparameters
They're part of a larger series, linked in the articles.

Finding errors in datasets with Similarity Search

Coming with a handy demo, this article illustrates how distance-based methods can be used to find errors in categorized datasets.
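
To give a flavour of the idea, here is a minimal Python sketch of one common distance-based check. It is not necessarily the method used in the article. It flags items whose category disagrees with the majority category of their nearest neighbours. The points, labels and the `suspicious_labels` helper are all made up for the example.

```python
import math
from collections import Counter


def suspicious_labels(points, labels, k=3):
    """Return indices whose label differs from the majority label of their k nearest neighbours."""
    flagged = []
    for i, p in enumerate(points):
        # Brute-force distances to every other point; fine for a toy example.
        others = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbour_labels = [labels[j] for _, j in others[:k]]
        majority, _ = Counter(neighbour_labels).most_common(1)[0]
        if majority != labels[i]:
            flagged.append(i)
    return flagged


# Hypothetical 2D embeddings: two clusters, with one item mislabelled on purpose.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.15, 5.05)]
labels = ["cat", "cat", "cat", "dog",   # index 3 sits inside the "cat" cluster
          "dog", "dog", "dog", "dog"]

print(suspicious_labels(points, labels))  # prints [3]
```

On real data the points would typically be embeddings of text or images, and a proper similarity-search index would replace the brute-force loop.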


Machine Learning Algorithms with Python

Yes, all of them implemented and explained.

Pretty Maps in Python

Pretty good tutorial.
"The project only contains 425 lines of Python due to intense 3rd-party package use.
The map data is collected from OpenStreetMap using the OSMnx library. This library itself is made up of only 3,700 lines of Python as its relying on NetworkX. NetworkX wraps up functionality relating to complex networks and is made up of 78K lines of Python. NetworkX is largely the work of Aric Hagberg, an applied mathematician at the Los Alamos National Laboratory as well as Jarrod Millman who was once the release manager for both NumPy and SciPy.
Rendering is handled by vsketch, a project made up of 4.6K lines of Python and is based largely on the efforts of Antoine Beyeler, an entrepreneur based in Switzerland.
In this post, I'll walk through generating the following rendering of Tallinn's Old Town."
There's also a website allowing you to generate the maps (but don't abuse it, please).



Sponsored content

We found the best JavaScript newsletter.
Bytes is probably the funniest web dev newsletter you'll ever read (trust me). If you like our newsletter, I've got a feeling you'll love Bytes too. There's a reason 100k developers read it every week.


Data thinking

The Only Algorithm for Hard Problems: Shake and Pull Gently

"(Or, “regularized greedy algorithms and their applications.”)"
Which prompts me to ask: is "switch it off and on again" a greedy algorithm?
"It occurs to me that there’s something I oughtta admit. It actually took me a bunch of tries to get that shot of the headphones coming untied. Most of the times I tried it didn’t work, or only worked partially, leaving me with some knots to undo by hand. Even if I can describe the technique in a simple way, actually applying it takes practice and skill. That applies to all the algorithms I’ve talked about here. Neural networks, SMT solvers, Place & Route engines; none of these things are straightforward. The devil is in the details, and getting to know such details can be a life’s work. I don’t want to denigrate that work; just point out some shared themes."


Dataviz, Data Analysis, & Interactive

Umbria Urban Ecosystem

Sadly, this website doesn't translate very well, but I still wanted to show it because it makes excellent use of radar/spider charts to visualise environmental data in a way that can be compared across multiple locations. It's really good work!


ERCOT (Texas Power Grid) by @danopia

A DataDog dashboard of power generation and usage in Texas.


Xanadu

Xanadu was the first hypertext ever designed, described in the 1960s by Ted Nelson – alternatively, it was described as "the longest-running vaporware story in the history of the computer industry". It is an interesting concept, although their claims of superiority to the World Wide Web are a bit stuck in the past. Also, I'm not entirely sure who owns their official website because their copyright statement is a little puzzling. But here you go, enjoy.


The masked bandits come to Germany

A Datawrapper blog post by Rose Mintzer-Sweeney where we learn that Germany is... being invaded by raccoons.


Fast Matrix Multiplication (Animated)

Exactly what it says on the tin.



quantum of sollazzo is supported by ProofRed's excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.

Sponsors*: casperdcl and iterative.ai, Jeff Wilson, Fay Simcock, Naomi Penfold

[*] This is for all $5+/month GitHub sponsors. If you are one of those and don't appear here, please e-mail me.
