#433: quantum of sollazzo – 3 August 2021
The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. How have you been? My week has been pretty interesting as a few work projects went live. If you want to learn more about them, I’ve just published the latest instance of my slow weeknotes (published on a monthly basis), which includes a summary of the current and future AI Skunkworks projects.
This week’s “Six questions” interviewee is Ahmad Barclay, currently Census Data Vis Lead at the UK Office for National Statistics, whose data visualization and mapping work has been widely praised. He was also a recent presenter at Geomob, still my favourite meetup with over 10 years of attendance.
Don’t forget that I’m still selling fine-art, printed-to-order, quality giclée map prints through my Etsy print shop. There is a general 10% discount for newsletter readers via this link or by using coupon code “NEWSLETTER” at checkout. Ask me to add a location, and you’ll get 20% off :-)
’Til next week,
Giuseppe @puntofisso
Six questions to...
Ahmad Barclay
Ahmad is Census Data Vis Lead, Office for National Statistics.
What is your daily data work like and what tools do you use?
Building interactive data visualisations as much as I can, whenever I can find the time in the day. Using a lot of Svelte, a JavaScript framework which is great for rapid prototyping. The rest of the time I’ll be managing my team, and attending planning and coordination meetings with people from across the ONS involved in Census 2021, from data, to analysis, to dissemination.
Tell me about a data project that you're proud of...
Palestine Open Maps. This project was my gateway drug into web mapping and front-end development. We stitched together 200+ historic maps and made them navigable, searchable and downloadable.
...and a data project that someone else did and you're jealous of.
Morphocode Explorer. A tool that allows you to explore neighbourhood-level data in maps and charts. I love the choice of data, how it’s visualised, and just how goddam smooth the whole user experience is.
If I say "dataset", you think of...
Spreadsheets. It doesn’t matter how many ways I manipulate and visualise data, I can’t break that datasets = 2-dimensional tables link in my mind.
Give someone new to data a tip or lesson you wish you'd learned earlier.
D3 doesn’t have all the answers. Don’t be ashamed of using tools like Excel and Illustrator to create static data visualisations. They’re often the fastest and most effective tools available, and Illustrator gives you complete control over the look and feel of the final visual.
Data is or data are...
Is… but that’s probably not what it says in the ONS style guide.
Become a Friend of Quantum of Sollazzo → If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker. You receive this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
Topical
Tokyo Summer Olympics Medal Count
“Bloomberg News is tracking the results of the Summer Olympics in Tokyo, including a schedule of events and the number of medals won by each country or delegation.”
Which Countries Are Doing Better — Or Worse — Than Expected At The Tokyo Olympics?
Just for a counterpoint to the previous article, this is “an updating medal count for every competing nation compared with the number of medals we thought each would have won so far — along with how many more they might take home”. By Quantum of Sollazzo interviewee Anna Wiederkehr and her colleagues at FiveThirtyEight.
Vaccination burnout?
A look from Reuters Graphics at how the spread of the Delta variant is pushing countries to speed up their vaccination plans, and at some of the problems that arise as vaccination rates reach a plateau.
Two-thirds of Southern Republicans want to secede
“A new YouGov survey conducted on behalf of a democracy watchdog group finds that 66 percent of Republicans living in the South say they’d support seceding from the United States to join a union with other Southern states.”
Erm, what…?!
NHS Digital Data Uses Register
Interesting development at NHS Digital, who have just launched a Data Uses Register, published on a monthly basis, which is basically a list of all their Data Sharing Agreements.
New Data Leads To Rethinking (Once More) Where The Pandemic Actually Began
Conspiracy? No conspiracy? NPR has taken and translated into human-readable language an academic paper that looks at the origins of COVID-19: “‘I do think transmission from another species, without a lab escape, is the most likely scenario by a long shot,’ says evolutionary biologist Michael Worobey at the University of Arizona.”
Who will succeed Angela Merkel?
“Our poll tracker shows who might be next into the chancellery”.
Once again, excellent data visualization work by The Economist.
Tools & Tutorials
Bicycle Network Analysis
By People For Bikes, a cycling advocacy organisation, this analysis allows you to “find out how well the bike network in your community connects people with the places they want to go”. London scores 58 (Amsterdam, for comparison, is 86).
Machine-learning on dirty data in Python: a tutorial
“Often in data science, machine-learning applications spend a significant energy preparing, tidying, and cleaning the data before the machine learning.
Here we give a set of Python tutorials on how some of these operations can be simplified with adequate machine-learning tools.”
This is in fact a series of hands-on tutorials. It still seems (mostly) under construction, but there’s some good material in it already.
Towards Inserting One Billion Rows in SQLite Under A Minute
This article is slightly on the techie side, but it might provide you with some very useful tips to speed up your inserts, which may be handy if you are trying to ingest a lot of data before an analysis.
“Current Best: 100M rows inserts in 33 seconds. (you can check the source code on Github)”
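If you want a feel for the kind of tips involved, here is a minimal sketch using Python's built-in sqlite3 module. It is not the article's code: the `readings` table, its columns, and the `bulk_insert` helper are made up for illustration. It shows three common speed-ups: relaxing durability settings with PRAGMAs, wrapping all inserts in one transaction, and sending rows in batches with `executemany`.

```python
import itertools
import sqlite3


def bulk_insert(n_rows: int, batch_size: int = 10_000) -> int:
    """Insert n_rows synthetic rows into an in-memory table; return the row count.

    The table name and columns are hypothetical, chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    # Trade durability for speed. Only sensible for throwaway ingestion,
    # where you can simply rebuild the database if something goes wrong.
    conn.execute("PRAGMA journal_mode = OFF")
    conn.execute("PRAGMA synchronous = OFF")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, area TEXT, value INTEGER)"
    )

    # A generator, so the full dataset never has to sit in memory at once.
    rows = ((i, f"area-{i % 100}", i * 2) for i in range(n_rows))

    # One transaction for the whole load: committing per row is the usual
    # reason naive insert loops are slow.
    with conn:
        while True:
            batch = list(itertools.islice(rows, batch_size))
            if not batch:
                break
            conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", batch)

    count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    print(bulk_insert(100_000))
```

Swap `":memory:"` for a file path to write to disk; with the PRAGMAs above, a crash mid-load can corrupt that file, so keep them away from data you cannot regenerate.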
Shapecatcher – Unicode Character Recognition
“Draw something in the box! And let shapecatcher help you to find the most similar unicode characters!”
This could be more useful than you initially think. My drawing is stuck at Year 1.
The Big Mac index
As you might know, since 1986 The Economist has been calculating and releasing a McDonald’s-based tool that illustrates exchange-rate theory. In recent years they have created an interactive currency comparison tool and the index itself has gone from 15 currencies to the current 57. The updates are explained in a recent Off the Charts newsletter issue.
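For the curious, the arithmetic behind the raw index is simple enough to sketch. This is my understanding of the basic calculation, not The Economist's own code, and the prices and exchange rate below are invented purely for illustration: the implied exchange rate is the local burger price divided by the US price, and the valuation is how far the actual rate sits from it.

```python
def big_mac_valuation(local_price: float, us_price: float, exchange_rate: float) -> float:
    """Return the over/undervaluation of a currency against the US dollar.

    local_price   -- burger price in local currency
    us_price      -- burger price in US dollars
    exchange_rate -- actual market rate, in local currency units per dollar

    A negative result means the local currency looks undervalued.
    """
    # The rate at which the burger would cost the same in both countries.
    implied_rate = local_price / us_price
    return (implied_rate - exchange_rate) / exchange_rate


if __name__ == "__main__":
    # Invented numbers: a burger costs 3.00 locally and 4.00 in the US,
    # while the market rate is 1.00 local unit per dollar.
    valuation = big_mac_valuation(local_price=3.0, us_price=4.0, exchange_rate=1.0)
    print(f"{valuation:.1%}")  # -25.0%
```

In this made-up case the burger implies a rate of 0.75 per dollar, so at a market rate of 1.00 the local currency comes out 25% undervalued.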
30 Days of ML
“Machine learning beginner → Kaggle competitor in 30 days. Non-coders welcome.”
100 Days D3 Dataviz
“As I too often get lost in seeing all the great work of others without me doing anything, I decided to dedicate my learning D3.js/Observable to a 100 days challenge. So far the results is this D3 DataViz learning collection. My objective is to scale it to a 100 days ‘course’.”
Postcodes.io
“Postcodes.io is an open sourced project maintained by Ideal Postcodes.
It is a free resource, allowing developers to search, reverse geocode and extract UK postcode and associated data.”
(h/t Oli Hawkins)
Data thinking
What is the right level of specialization? For data teams and anyone else.
“I think this specialization of data teams into 99 different roles (data scientist, data engineer, analytics engineer, ML engineer etc) is generally a bad thing driven by the fact that tools are bad and too hard to use”, Erik Bernhardsson argues (sort of).
Why Journalists Need An Archiving System
“Taking good care of data requires some time and money, but the loss of irreplaceable reporting work can come at a higher cost.”
An interesting look at best practices in (offline and online) data management for journalists, by Talya Cooper for the Global Investigative Journalism Network.
Dataviz & Interactive
In defense of simple charts
“Simple visualization types don’t need to be boring”, argues (future Quantum of Sollazzo interviewee) Lisa Charlotte Rost of Datawrapper.
(via Lucilla Piccari)
Why we’re blind to the color blue
“We’ll explore why our eyes are unable to focus on the color blue, and how we see with our brain as much as with our eyes”. Well, this article is not quite dataviz, but you’ve got to see it – the NASA example is pretty phenomenal.
AI
Not much in this space this week. But if you read my weeknotes you’ll find some reflections on the topic.
Quantum of Sollazzo is supported by ProofRed’s excellent proofreading service. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.