#508: quantum of sollazzo – 7 March 2023
The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I’ve been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you’re welcome to become a friend via the links below.
The most clicked link last week was the brilliant “100 dataviz” project, which is aiming to publish 100 data visualizations of the same dataset.
Standing ovation to Alex for this tweet. If you get it, you get it ;-)
’Til next week,
Giuseppe @puntofisso
Become a Friend of Quantum of Sollazzo from $1/month → If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker. You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
✨ Topical
Femicides: the undeclared war on women in Europe
“This unprecedented cross-border investigation, conducted with the participation of 18 newsrooms across Europe, attempts to shed light on femicides and rising violence against women at the time of the pandemic, as well as on the staggering shortage of up-to-date data on these phenomena.”
The Living New Deal
“This map shows New Deal public works and artworks documented by the Living New Deal. Every site is marked by a dot. Click on any dot and the panel shows what is there. For more on that site, click on Full Info.” (via Soph’s Fair Warning)
Data from satellites reveal the vast extent of fighting in Ukraine
“Scars of the war can be found far beyond the front lines.”
Look also for the work on building damage, visualized using Microsoft Open Buildings, an intriguing dataset of building footprints derived from satellite imagery using AI. Even better, the team at The Economist has released its code, which should allow anyone to re-run the analysis. Replicable journalism is awesome.
America’s Highest Earners and Their Tax Revealed
ProPublica looks at IRS files. There’s a searchable table of the 400 highest earners (Trump is not one of them).
Work from home – a rich vs. poor issue, an urban vs. rural issue
A really well-illustrated investigation by Leonardo Nicoletti and Caroline Cullinan (US and UK only). It uses the d3-hexjson plugin developed by our old friend Oli Hawkins. The link to the Observable notebook at the end is broken, but I’ve written to the authors to ask them to fix it.
🛠️📖 Tools & Tutorials
Introduction to Machine Learning
“With help from the London School of Economics and Political Science, VRT News and Texty our Journalism AI courses offer an insight on how journalists are using machine learning in their newsrooms.”
CarbonCounter
A handy tool developed at MIT to track a vehicle’s costs and emissions, as used in this article in the New York Times.
Visualizing Neural Networks with the Grand Tour
“The Grand Tour is a classic visualization technique for high-dimensional point clouds that projects a high-dimensional dataset into two dimensions. … This visualization shows the behavior of the final 10-dimensional layer of a neural network as it is trained on the MNIST dataset. With this technique, it is possible to see interesting training behavior.”
Tomorrow’s weather
Creating a global weather map in R, by Dominic Royé.
Plot.geo | Overcoming Common ‘Gotchas’
A handy guide to rendering geographies on Observable.
Introduction to Data-Centric AI
An MIT course.
“Typical machine learning classes teach techniques to produce effective models for a given dataset. In real-world applications, data is messy and improving models is not the only way to get better performance. You can also improve the dataset itself rather than treating it as fixed. Data-Centric AI (DCAI) is an emerging science that studies techniques to improve datasets, which is often the best way to improve performance in practical ML applications.”
Course: Data Visualization Fundamentals and Best Practices
A free online course by Robert Kosara, Data Visualization Developer at Observable.
🤯 Data thinking
Most Data Work Seems Fundamentally Worthless
“There is a flavor of despair I’ve become accustomed to, so deeply ingrained in the hearts of myself and my colleagues that it has settled into a hopeless passivity. It’s the despair that comes from knowing that we spend most of our time producing nothing of value.”
Is the data hype real?
Pioneer in Black Data: Monroe N. Work and the Negro Year Book
“Monroe N. Work was an African American sociologist, scholar, and researcher who spent his life collecting information and helping others to understand it. The highlight of his career, according to Work, was the nine editions of the Negro Year Book between 1912 and 1938. Each edition was an encyclopedic collection of yearly facts and data that covered many aspects of African American life as compiled by Work from data submitted from the wider community. Each subsequent edition quickly became the essential source of Black data in the United States and was reported on widely by the White and Black press and used as a resource equally in many schools in America and abroad.”
📈 Dataviz, Data Analysis, & Interactive
Kaleidoscope Brain: 100 Visualizations of Moby-Dick
A PDF book of visualizations by Peter Gorman, which I found, oddly, via this interview with him on Bloomberg.
“Moby-Dick is infamous for its digressions. …His narration feels like a twistingturning struggle to explain everything. Reading Moby-Dick actually made me feel like that. …The maps I was making were obsessive and encyclopedic. They were newer and weirder and they digressed beyond straightforward geography. I’ve presented these designs (almost) in the order I made them. Instead of retelling the plot, they’re more like a sketchbook. They’re a record of my experience after reading Moby-Dick…”
Deaths of despair during the COVID pandemic
A great thread by Colin Angus summarising his latest research on drug and suicide mortality during the first two years of the pandemic. Drug-related deaths in the US are particularly tragic.
Social Media Usage by Age
FlowingData visualizes data from the Pew Research Center.
A year of full-scale war
“We have tracked the ebb and flow of the Russian occupation, the weapons used by the enemy to lay waste to Ukraine’s cities and energy system, the intensity of the strikes and their effect on the illumination of the capital of Ukraine in 2022.”
Some pretty good data visualizations.
🤖 AI
Crochet enthusiasts asked ChatGPT for patterns. The results are ‘cursed’
“The widely popular chatbot is churning out uncanny animal designs and we tried one for a ‘hilarious’ outcome”.
An article that unites my job and my hobby…
Keep your AI claims in check
“Advertisers should take another look at our earlier AI guidance, which focused on fairness and equity but also said, clearly, not to overpromise what your algorithm or AI-based tool can deliver. Whatever it can or can’t do, AI is important, and so are the claims you make about it. You don’t need a machine to predict what the FTC might do when those claims are unsupported.”
Among other things, they use an interesting writing style for a government website.
Large Language Models like ChatGPT say The Darnedest Things
“The Errors They Make, Why We Need to Document Them, and What We Have Decided to Do About it”
quantum of sollazzo is supported by ProofRed’s excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.
Supporters*: casperdcl and iterative.ai, Jeff Wilson, Fay Simcock, Naomi Penfold
[*] This is for all $5+/month GitHub sponsors. If you are one of those and don’t appear here, please e-mail me.