#432: quantum of sollazzo – 27 July 2021
The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. This has been a relatively quiet week, and I haven’t found that much data stuff around, especially in data journalism, so this is a slightly less visual issue than usual. I bet all writers and data wranglers are taking some well-deserved holidays (while developer-thinkers clearly are not).
This week’s “Six questions” interviewee is Gurman Bhatia, an independent journalist and information designer whose work I’ve been following for a few years, since she worked at the Hindustan Times (one of India’s most popular newspapers) and then at Reuters Graphics.
’Til next week,
Giuseppe @puntofisso
Six questions to...
Gurman Bhatia
Gurman is an independent journalist and information designer.
What is your daily data work like and what tools do you use?
Acquiring, cleaning/analysing and visualising data for the news. Google Sheets for creating datasets, VSCode as a text editor. GitHub for version control and sharing work. Notion for research and note-taking. A whiteboard, pen and paper, because the mind moves in many directions. Node.js for scripted data extraction/analysis. R for scripted analysis/static viz/prototyping. Illustrator for finishing up. d3.js for bespoke and interactive visualisations.
Common workflows for viz:
- for static viz - d3.js/R/Datawrapper/Rawgraphs → Illustrator → ai2html
- for interactive viz - html/css/js with d3.js
Tell me about a data project that you're proud of...
COVID-19 vaccine rollout: charts, maps and eligibility by country.
There were tons of COVID dashboards around the world. However, this one offered more context, trying to tell a story in a space where people are mostly staring at a wall of numbers. It was a team effort, but I am particularly proud of how the UX for the top globe turned out, plus the vaccine equity charts in a dashboard!
...and a data project that someone else did and you're jealous of.
These simulations show how to flatten the coronavirus growth curve
In the news industry we often think about impact, something that is usually hard to measure. You want something to improve in society because of your work. And Harry's piece did exactly that. It came at a time when we needed to understand the need for social distancing. And even if only a handful of people stayed at home because of this piece (I assume it was actually more), that is the BEST kind of impact. Apart from being the most read article in the history of the Washington Post, it was also translated into multiple other languages.
If I say "dataset", you think of...
jsons and csvs
Give someone new to data a tip or lesson you wish you'd learned earlier.
Your dream dataset does not exist.
Data is or data are...
I believe... data is plural (obviously inspired by Jeremy Singer-Vine's awesome newsletter). 🙈
Become a Friend of Quantum of Sollazzo → If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker.
You receive this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso).
If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
Topical
How the UK’s Covid-19 vaccine rollout has dramatically reduced deaths
“Deaths have plummeted from 400 at the same point during the second wave to just eight.”
The chart in this New Statesman article captures it brilliantly.
Which Olympic Event is for You? 🥇
From the good folks at Count, this is a good look at data about the Olympics, with several interactive visualizations.
CQC Death Notifications from Care Homes
The Care Quality Commission has published data on care home residents’ deaths. It’s rather sobering.
The only good COVID (related) wave
An excellent animated visualization based on ONS data.
Air Pollution in Lambeth and Southwark
A vulnerability and exposure dashboard.
Tools & Tutorials
Machine learning in a hurry: what I’ve learned from the SLICED ML competition
“In this post, I’ll share some of what I’ve learned about using R, and the tidymodels collection of packages, for competitive Kaggle modeling.”
An interesting report from the SLICED competition.
Data thinking
For SQL
If you’ve read last week’s “Against SQL” article, this is a good response to it.
“I don’t doubt that there is a world where
churn[['State','Score']].groupby('State').mean().sort_values(by='Score', ascending=False)
is more useful than
SELECT state, AVG(score) FROM churn GROUP BY state order by score;
But there are many worlds where the latter is more than just fine.”
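For readers who want to see the two styles side by side, here is a minimal sketch that runs both on the same toy data. The rows are made up for illustration, and the SQL differs slightly from the quoted line: it adds an alias (avg_score, a hypothetical name) and DESC so that the ordering matches the pandas call, since ORDER BY sorts ascending by default. SQLite in memory is just a convenient stand-in for whatever database you use.

```python
import sqlite3

import pandas as pd

# Hypothetical toy rows; the column names follow the quoted snippets.
rows = [("NY", 4.0), ("NY", 2.0), ("CA", 5.0), ("CA", 3.0), ("TX", 1.0)]

# pandas version, written as in the quote.
churn = pd.DataFrame(rows, columns=["State", "Score"])
by_pandas = (
    churn[["State", "Score"]]
    .groupby("State")
    .mean()
    .sort_values(by="Score", ascending=False)
)

# SQL version, run against an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE churn (state TEXT, score REAL)")
conn.executemany("INSERT INTO churn VALUES (?, ?)", rows)
by_sql = conn.execute(
    "SELECT state, AVG(score) AS avg_score "
    "FROM churn GROUP BY state ORDER BY avg_score DESC"
).fetchall()
conn.close()

print(by_pandas)
print(by_sql)
```

Both produce the average score per state, highest first. Which one reads better is, of course, exactly what the debate is about.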
Analytics is at a crossroads
“The world is full of great analysts. Will we have the courage to go looking for them?”
This is a response to the Against/For SQL debate.
Dataviz & Interactive
What percentage of the world is asleep or awake as the day progresses
“Assuming a solid 8 hours at night!”
It’s an animated image, and a very interesting one at that. It had never occurred to me that there are times of the day when literally 90% of humans are probably awake.
Growing Urban Bicycle Networks
“For 62 cities we study different variations of growing a synthetic bicycle network between an arbitrary set of points routed on the urban street network.”
A research paper on the topological limits of cycling lane network development.
Source code is available here and here’s an example with Rome.
It’s all based on OpenStreetMap.
Singapore top shipping centre for eighth year running
“Top global shipping hubs and short-term bulk, oil and container outlook revealed in recently published Baltic-Xinhua report”
An interesting article on the Marine Traffic blog, which includes a few pretty cool maps.
The topologist’s map of the world
What happens if you try to create a chart of all countries’ border relations? This one does it, though not even completely, as it excludes exclaves and enclaves. A few more regional/historical maps are also offered.
(via Davide Tassinari)
AI
The ethics of a deepfake Anthony Bourdain voice
“The new documentary 'Roadrunner' uses A.I.-generated audio without disclosing it to viewers. How should we feel about that?”
Helen Rosner explores this question for the New Yorker.
https://twitter.com/martinstabe/status/1416658472020713475?s=03
To retrain, or not to retrain? Let’s get analytical about ML model updates
“Is it time to retrain your machine learning model?”
A long, useful tutorial.
quantum of sollazzo is supported by ProofRed’s excellent proofreading service. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.