#449: quantum of sollazzo – 7 December 2021
The data newsletter by @puntofisso.
Hello, regular readers, and welcome, new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I’ve been sending this newsletter since 2012 as a summary of the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you’re welcome to become a friend via the links below.
Every week I include a six-question interview with an inspiring data person. This week, I speak with David Dubas-Fisher, the first sports data journalist to be featured in Quantum of Sollazzo’s Six Questions. If you read any Reach newspaper, such as the Daily Mirror, you’re likely to have seen his excellent dataviz work.
I haven’t posted about my day job for a bit. Last week, Hospital Times published a nice article, based on a conversation I had with editor David Duffy, which nicely captures the approach, philosophy, and values of my AI Skunkworks team.
This week, The Prepared features the story of the Bay Model, a 1:1000 scale recreation of the SF Bay area hydrology, used in the 1950s to study the impacts of hydroelectric development. Subscribe to The Prepared for stories about physical engineering and how data lives in the real world.
’Til next week,
Giuseppe @puntofisso
Six questions to...
David Dubas-Fisher
David is a sport data journalist at the Reach Data Unit.
What is your daily data work like and what tools do you use?
I use data to produce sport stories for Reach’s national and regional titles. A lot of sport data journalists focus on things like tactics and player performance, but I don’t have access to that data, so I focus on things like rising ticket prices, the number of racist incidents at matches, or club finances. That means a lot of searching the internet and manually entering data into Google Sheets. That’s my main tool for analysing data where I try to find as many interesting lines for our various titles as possible.
Tell me about a data project that you're proud of...
The first one that springs to mind is our football finance gadget. It gives key financial information about each club in the top four divisions of English football over the last five seasons (for example, see this article). Before I started it, though, I had no idea how to read company accounts, and all of them are in PDF format and unsearchable. That meant I had to give myself a crash course in company accounts, and then manually search hundreds of PDFs and enter the data into a spreadsheet. The Data Unit’s designers and developers then created a gadget that can be inserted into any Reach story about football finances.
...and a data project that someone else did and you're jealous of.
I remember seeing a story in the Guardian a month or two ago about MPs taking Euro 2020 tickets from gambling firms and was kicking myself for not having done it. One of those rare times where sport reporters get to mix things up and do a bit of politics.
If I say "dataset", you think of...
A lot of work heading my way.
Give someone new to data a tip or lesson you wish you'd learned earlier.
The data isn't the story. Don't write lots of words about percentage increases and what have you; people need to know the real-world implication of what the data is telling you.
Data is or data are...
"Data is". I know data is technically a plural word, but people who insist on using it that way are just Grampa Simpson yelling at a cloud.
Become a Friend of Quantum of Sollazzo → If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker. You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
Topical
The Winners and Losers From a Year of Ranking Covid Resilience
As this article’s subheading rightly notes, “past performance is no guarantee of future success – or failure”. The team at Bloomberg has been working for over a year on their Covid Resilience Ranking, which tracks how well or badly different places have been doing during the pandemic.
Populism on Twitter might be a problem for the Labour Party
Another intriguing analysis by Italian data journalist Francesco Piccinelli, covering how populism is expressed in tweets by British MPs. It uses a custom definition of “populism” by the author, based on the NRC algorithm. All the R code used to conduct the analysis and generate the chart is available.
Just like modern humans, honeybees avoid each other amid plagues
“They segregate behaviours in different parts of their hives to prevent parasites from spreading”, the Economist shows us.
Tools & Tutorials
Lorem Faces
Dummy faces of non-existing people, generated by a photorealistic AI. The licensing terms are a bit too complex, though.
Adding A Dyslexia-Friendly Mode To A Website
A few practical CSS steps can really help your website become more accessible.
Google Journalist Studio
This collection of tools in one place might be handy for reporters.
Financial market data analysis with pandas
“Since a lot of financial data is available, much of it free, it makes a great playground to learn more about analyzing and working with time series data.”
Data thinking
People mistake the internet’s knowledge for their own
Another academic paper this week.
Grim abstract excerpt: “People frequently search the internet for information. Eight experiments (n = 1,917) provide evidence that when people Google for online information, they fail to accurately distinguish between knowledge stored internally—in their own memories—and knowledge stored externally—on the internet. […] As a result, people may lose sight of where their own knowledge ends and where the internet’s knowledge begins.”
42 things I learned from building a production database
Thoughts by researcher Mahesh Balakrishnan, who worked at Facebook as an engineer.
Dataviz & Interactive
The meanings of life
What gives life meaning? Rose Mintzer-Sweeney at Datawrapper explores this question through data from Pew, which surveyed 18,850 people “on what makes their own lives meaningful, fulfilling, or satisfying”.
Spatial Analysis of Dog Ownership and Car Use in the UK
I love academics for coming up with such ideas.
“Walking the dog, and other dog-related practices, have been suggested to be particularly car-dependent. Secondary data analysis presented finds associations between the high energy use practices of car travel and dog ownership. There is a strong association between the rate of dog ownership and car km travelled per person. This relationship holds when controlling for income, level of urbanisation, housing type and demographic variables.”
Most Common Daily Routines
One of those very pretty data visualizations by FlowingData, showing the most common uses of time, throughout the day, in the US. Some pretty cool insight comes from the chart showing how schedules become more complex as the number of children in a family goes up.
What is code?
A highly visual long essay by Paul Ford for Bloomberg.
Snake Oil Supplements?
“Scientific evidence for popular health supps”, visualized by David McCandless and Tom Evans for Information is Beautiful, with research by Six Questions graduate Miriam Quick.
(via Lisa Riemers)
The fry universe
Just a 3D visualization of… fries/crisps.
Emoji to Scale
Yes, you can give each emoji its real-life size, and…
quantum of sollazzo is also supported by ProofRed’s excellent proofreading service. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.