#424: quantum of sollazzo – 1 June 2021
The data newsletter by @puntofisso.
Hello folks, quite a bit happening in my work life, including 3 fantastic new starters this week and, hopefully, our first team picnic. Really excited about the all-star team I’m building, including data techies, scientists, and community builders. Quite a bit of work ahead!
I found some time to again play with cyanotyping maps extracted from OpenStreetMap. Still not quite there with the results, but the hope is, at some point, to make a few prints that are good enough to go on my good old maps Etsy shop (there are still 3 road colouring maps available, btw…).
This week’s data hero interview is with FiveThirtyEight’s Anna Wiederkehr. I’ve been following her steps since she worked at the Neue Zürcher Zeitung, and she’s been thoroughly impressive over the years, so I’m really pleased she agreed to be interviewed. I hope you enjoy it!
Speaking of interviews: which people working in or with data would you like me to interview? Just hit reply and let me know. Thanks!
And last, here’s the latest edition of my ‘slow weeknotes’, if you’re curious to know what I’ve been up to in May.
’Til next week,
Giuseppe @puntofisso
Six questions to...
Anna Wiederkehr, Senior Visual Journalist at FiveThirtyEight.
What is your daily data work like and what tools do you use?
What's open on my computer at the moment is a pretty accurate representation: Sketch, Illustrator, R, Atom (code editor with some HTML/CSS/JS beep bops), Microsoft Excel, Notion and a browser open with roughly 40 tabs including Google Sheets, Google Docs and Github. I used to live in a more isolated design world, but in data journalism, there seem to be fewer phases where you _only_ need design tools or _only_ dev tools.
Tell me about a data project that you're proud of...
I'm still stoked on this explainer piece on wildfires in Switzerland I did when I ran the graphics team at the Neue Zürcher Zeitung, Waldbrände – das flackernde Desaster (here in English and here a Twitter thread on the process). I co-wrote it with a great science journalist, Sven Titz, and worked closely with two researchers from the Swiss Federal Institute for Forest, Snow and Landscape Research. I learned so much about geography, mapping and honestly, the earth, in working with and visualizing National Forest Inventory and SwissFire data. I've published arguably "greater" work depending on how you measure greatness, but successful collaborations outside of my industry are what I'm consistently more proud of than anything else.
...and a data project that someone else did and you're jealous of.
I'll never forget seeing this piece for the first time from Marco Hernandez and Pablo Robles with the egg-shaped world showing earth's surface temperature – I'm routinely in awe of the unique visual storytelling from South China Morning Post and how they manage such density in their visualization combined with an approachable style.
Otherwise I am jealous of/would love the opportunity to work on anything with Joshua Stevens from NASA, Tim Wallace from NYT, along with Mira Rojanasakul and Jeremy C.F. Lin from Bloomberg. I'm sending all of them virtual heart-eyes.
If I say "dataset", you think of...
I think "god I hope this isn't only available as a PDF."
Give someone new to data a tip or lesson you wish you'd learned earlier.
I fiercely endorse Soph Warne's tip to comment the hell out of everything. If anyone looks at any code I write, you'll see more comments than actual functioning code.
In a similar vein, versioning is critical to iterative design work. So if you have the luxury of time when experimenting with visualization on a project, save versions (i.e. different files) of your work. This is great for organization, the ability to resurface a past idea that might be better, the exploration of many concepts, and documentation. Plus you can share it so others can learn from your work too!
Data is or data are...
"Data is" but really: Meh. I'm more offended by visualise vs visualize, if you really want to have a debate :)
Topical
Most common professional marriages
This piece by FlowingData references this San Francisco Chronicle article, which, of course, finds that the stats are slightly skewed towards software developers marrying developers… you know, it’s the Bay Area.
Shifting songs of Eurovision
Yeah, Italy won, stop reminding me of that. You know, I’m a big fan of the previous Italian winner (mainly because a lifetime ago it got me a 2-page spread in a printed British newspaper).
But Eurovision inspired, as usual, quite a bit of dataviz. Among others, Reuters Graphics published this wholesome explainer (extra points if you recognise the pretty obvious outlier in the picture), while this page by EurovisionWorld.com is a mine of data about the competition.
(via Ian Chaplin)
What does global warming spell for you… and your loved ones?
Isabella Chua at Kontinentalist, a storytelling outlet based in Singapore, takes a pretty peculiar approach to analysing climate change: she shows how different generations in Asia have witnessed or are witnessing changing average temperatures.
“To put things into perspective, we plot how average temperatures will change in each generation’s lifetime. The charts begin with the earliest birth year of each generation: 1946, 1965, 1981, and 1997, for Boomers, Gen X, Millennials, and Gen Z, respectively.”
The Pandemic Has Split in Two
“Zero deaths in some cities. Thousands in others. The pandemic’s fault lines continue to widen as vaccines flow toward rich countries.”
The New York Times reports on vaccine inequality (while also showing how cities in South East Asia and Australasia remained relatively unscathed with both low vaccination rates and low infection numbers, presumably because of early, strict lockdown measures).
Tools & Tutorials
Forecasting s-curves is hard
Researcher Constance Crozier writes about a topic that received a lot of attention and debate during the early stages of the pandemic: how to predict the trend in sigmoid curves – functions that “start with exponential growth, then increase linearly, and finally level off” – as more data becomes available. TL;DR: it’s hard. Some code is available, too.
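To get a feel for why this is hard, here is a minimal Python sketch. It is not Crozier's code: the logistic formula, the parameter values, and the names `logistic`, `small`, `large` and `relative_gap` are all my own assumptions for illustration. It compares two made-up s-curves whose final sizes differ tenfold but whose early sections are almost identical.

```python
import math


def logistic(t, cap, rate, midpoint):
    """A common s-curve: cap / (1 + exp(-rate * (t - midpoint)))."""
    return cap / (1.0 + math.exp(-rate * (t - midpoint)))


# Two hypothetical curves with the same growth rate (illustrative numbers only).
# The second levels off ten times higher; its midpoint is shifted by
# ln(10) / rate so that both curves start on the same exponential.
rate = 0.2
small = dict(cap=1_000.0, rate=rate, midpoint=40.0)
large = dict(cap=10_000.0, rate=rate, midpoint=40.0 + math.log(10.0) / rate)


def relative_gap(t):
    """Relative difference between the two curves at time t."""
    a = logistic(t, **small)
    b = logistic(t, **large)
    return abs(a - b) / b


# Largest relative gap over the first 15 time steps, and the ratio of
# the two curves once both have levelled off.
early_gap = max(relative_gap(t) for t in range(15))
final_ratio = logistic(200, **large) / logistic(200, **small)

print(f"largest early gap: {early_gap:.2%}")
print(f"final ratio: {final_ratio:.1f}")
```

In this toy setup the early points differ by less than one percent, so any realistic amount of noise in the data would make the two futures indistinguishable until the curve starts to bend.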
Learn CSS
“An evergreen CSS course and reference to level up your web styling expertise.”
It benefits from its interactivity, and could be useful to some of you web dataviz folks.
Flat Data
“Flat explores how to make it easy to work with data in git and GitHub. It builds on the “git scraping” approach pioneered by Simon Willison to offer a simple pattern for bringing working datasets into your repositories and versioning them, because developing against local datasets is faster and easier than working with data over the wire.”
Another sign that GitHub is slowly moving way beyond its code repository beginnings. An effect of the Microsoft acquisition?
Dot Plots
“A dot plot visualizes a univariate (1D) distribution by showing each value as a dot and stacking dots that overlap.” Interactive code on Observable (and a comparison with Vega).
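If you just want to see the stacking idea without opening a notebook, here is a rough text-only sketch in Python. It is not the Observable implementation: it simplifies "dots that overlap" to "values that round to the same bin", and the function name `dot_plot` and its `bin_width` parameter are made up for this example. It assumes a non-empty list of numbers.

```python
from collections import Counter


def dot_plot(values, bin_width=1.0):
    """Return a text dot plot: one column per bin, one 'o' per value.

    Simplification: values are grouped by rounding to the nearest bin,
    rather than by testing whether drawn dots actually overlap.
    """
    bins = Counter(round(v / bin_width) for v in values)
    lo, hi = min(bins), max(bins)
    height = max(bins.values())
    rows = []
    # Build the picture from the top row down to the baseline.
    for level in range(height, 0, -1):
        row = "".join(
            "o" if bins.get(b, 0) >= level else " " for b in range(lo, hi + 1)
        )
        rows.append(row.rstrip())
    return "\n".join(rows)


plot = dot_plot([1, 2, 2, 3, 3, 3, 5])
print(plot)
```

Each column is one bin and the column height is the number of values that fell into it, so the shape of the distribution can be read directly from the stacks.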
Timeline cascade maker
What it says on the tin, in a usable web version and its source code.
Dataviz & Interactive
River Runner
“Click to drop a raindrop anywhere in the contiguous United States and watch where it ends up”.
This is a truly mind-boggling piece of data visualization. The data sources used are all linked.
(via Guy Lipman)
Counting the sunny hours
Gregor Aisch, Datawrapper’s CTO, shows his attempts at a different take on creating a weather app: “Instead of a single weather icon, I want to use two columns, one for the sunshine duration and the other for rain! Who cares about a little bit of rain when there are six hours of sunshine during the day?”. Data sources are linked.
ThanAverage
“ThanAverage is a small unscientific investigation into how we value and compare ourselves to each other.”
Data thinking
Einstellung Effect: What You Already Know Can Hurt You
Ok, not quite about data, but I see it applying to data quite a bit!
“The Einstellung effect is a psychological phenomenon that changes the way we all come to solutions and impedes innovation.”
A thread on data gathering, ads, and privacy
“I’m back from a week at my mom’s house and now I’m getting ads for her toothpaste brand, the brand I’ve been putting in my mouth for a week. We never talked about this brand or googled it or anything like that.
As a privacy tech worker, let me explain why this is happening.”
AI
10 Positions Chess Engines Just Don’t Understand
“Despite the clear superiority of engines, there ARE positions which chess engines don’t (and possibly can’t) understand that are quite comprehensible for human players. Typically these positions showcase the human ability to think creatively and formulate plans and understand long-term factors in the position.”
I know just enough about chess to understand what this article is saying, and much less about automatic chess engines, but… doesn’t this issue strictly depend on the type of AI algorithm used – something that the article only acknowledges without further exploring it? I’d understand if the automatic player is using a traditional minimax with alpha-beta pruning, but I’m not entirely sure why this would affect, generally speaking, a neural network algorithm.
Become a GitHub Sponsor. It costs about the price of a coffee per month, and you’ll get an Open Data Rottweiler sticker (and other stuff). Or you can Buy Me A Coffee.
quantum of sollazzo is also supported by ProofRed’s excellent proofreading service. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.