418: quantum of sollazzo
#418: quantum of sollazzo – 20 April 2021
The data newsletter by @puntofisso.
Commenting on the debate in last week’s issue about how users of AI on social media platforms need to be careful about their own responsibility in shaping user behaviours, my friend Luis, a reader of this newsletter, sent me this excerpt: “Social Media evolved from an information platform to an affirmation platform”. This is a pretty good description of what social media has become, and why we need to be very careful in using AI in this context.
Speaking of context, I read this brilliant piece by Charlie Warzel on what happened to journalist Elle Hunt when she posted an admittedly half-serious opinion about films. It was tweeted in a mixture of faux-outrage and humour, was picked up by Twitter’s “trending” algorithm, and ended up as the object of (very real) abuse. Charlie describes it as “context collapse”, a definition I really like because I think it applies to the very similar phenomenon that happens when data is quoted without its accompanying context, something that Twitter has a lot to answer for given its reduced text length.
In other news, I think we’ve finally reached Peak 2021:
Your links are below.
’Til next week,
Giuseppe @puntofisso
Many journalists fascinated by data journalism have struggled with entry barriers and complex tools for data analyses and visualizations. In order to lower barriers and foster data journalism, the European Data Journalism Network has been developing some tools that can make journalists' lives easier.
For instance, the Stats Monitor offers ready-made dataviz and newsleads based on the latest data by Eurostat and other sources, highlighting trends and outliers. The Quote Finder comes in handy if you want to explore the EU debate, including MEPs' activity on Twitter or the official statements by EU institutions.
We have also been listing data sources that can be useful for journalists, dividing them by themes.
We also have a weekly newsletter.
Topical
Swelling Anti-Asian Violence: Who Is Being Attacked Where
“Over the last year, in an unrelenting series of episodes with clear racial animus, people of Asian descent have been pushed, beaten, kicked, spit on and called slurs.”
A data-driven look by the New York Times at the growing anti-Asian racism in America.
Yelp Local Economic Impact Report: A Look at Diverse Businesses
The usual quarterly look at the Yelp Economic Average, each time offering some great insight, with a methodology explainer attached.
Myanmar’s internet suppression
“In Myanmar, the junta’s intensifying crackdowns on protesters in the street are mirrored by its rising restrictions online.”
Much appreciation to Reuters Graphics for their ability to cover intriguing angles of current stories by looking at totally unexpected data and showing how they relate to each other. Bravo.
Tools & Tutorials
OpenStreetMap: From Browser Querying to Python+R Manipulation
“A succinct guide for Data Scientists”. Simple and easy to replicate.
Data Acquisition for Beginners
Part of “The Kit” by Exposing the Invisible, this is a great introduction to the topic of data acquisition. Ideally, read the whole kit.
Exploring other {GGPLOT2} geoms
If you use ggplot2, a popular data visualization library for the R programming language, this tutorial offers a good set of reusable examples of how to create streamgraphs, ridgeline plots, Sankey diagrams, bump charts, waffle charts, beeswarm charts, and mosaic charts.
Google Earth is now a 3D time machine
“Google puts 20 petabytes of historical satellite data into the Google Earth globe”, ArsTechnica reports.
Of course, not all of the planet has equal amounts of data and, most importantly, it isn’t entirely obvious to me if it gets updated and with what frequency (maybe it’s just me being unable to Google for it…), but if it’s got to this point, it might get better: “With Timelapse open, you’ll get a big panel on the right side with a timeline from 1984 to today, and a few shortcuts to places Google says are particularly interesting. Google Earth Timelapse doesn’t work well across the entire world just yet. Some places, like New York City, appear hopelessly blurry, even when you set the timer to 2020. Google’s highlighted locations, like Dubai, look a lot better and play out like a game of SimCity.”
I checked the Beirut Port area to see if the recent explosion could be spotted, but I didn’t manage it.
What is on the other side of the globe?
Topi Tjukanov is to be thanked for this useful QGIS query that can be used to generate a map like the one below.
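If you'd rather play with the idea outside QGIS, the geometry behind "the other side of the globe" is simple: flip the sign of the latitude and shift the longitude by 180 degrees. The snippet below is only a minimal Python sketch of that calculation, not Topi's query; the function name antipode is made up for illustration, and it assumes plain latitude/longitude in decimal degrees.

```python
def antipode(lat, lon):
    """Return the point diametrically opposite (lat, lon), in decimal degrees."""
    # The opposite latitude is the same distance from the equator, other hemisphere.
    anti_lat = -lat
    # Shift the longitude by 180 degrees and wrap the result into [-180, 180).
    anti_lon = (lon + 360.0) % 360.0 - 180.0
    return anti_lat, anti_lon


# Example: a point at 40N, 74W (roughly the New York area).
opposite = antipode(40.0, -74.0)
```

Here opposite comes out as 40 degrees south, 106 degrees east, which is in the ocean, like most antipodes of land.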
Process large datasets without running out of memory
A handy set of techniques to process larger-than-RAM datasets in Python by operating on code structure, using data management strategies, measuring memory usage, and adopting advanced techniques in Pandas and NumPy.
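To give a flavour of the "code structure" idea, here is a minimal sketch of my own (not taken from the linked guide) that computes a column average by streaming a CSV one row at a time, so memory use stays flat however long the file is. It uses only the standard library; the function name column_mean and the tiny sample data are hypothetical.

```python
import csv
import io


def column_mean(lines, column):
    """Compute the mean of one numeric CSV column, reading one row at a time."""
    total = 0.0
    count = 0
    # csv.DictReader pulls rows lazily, so only the current row is held in memory.
    for row in csv.DictReader(lines):
        total += float(row[column])
        count += 1
    return total / count if count else float("nan")


# Tiny in-memory stand-in for a large file; with a real dataset you would
# pass an open file handle instead, and it would be read just as lazily.
sample = io.StringIO("city,value\nRome,1\nLondon,2\nParis,6\n")
mean_value = column_mean(sample, "value")
```

If you're already in Pandas, the chunksize parameter of read_csv gives you the same pattern, handing you the file in pieces that you can aggregate as you go.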
Writing tools I learned from The Economist
“I like the writing style of The Economist for many reasons: the most important is that it’s easy to understand their point. […] These are 6 writing tools I learned from The Economist. As you’ll see, they exist to serve, not confuse, the reader.”
Ok, this article is not about data but about writing style. Still, if you’re writing about data, you should follow the same set of rules. (Also, I didn’t know that The Economist calls itself a newspaper rather than a magazine.)
AI
Training Facial Recognition on Some New Furry Friends: Bears
“In their spare time, two Silicon Valley developers aided conservationists in developing artificial intelligence to help keep track of individual bears.”
Fascinating.
(via Juliana Antoninus)
Open Data
Land searches with French Open Data
As Tom Forth reports in this Twitter thread, this is “a fantastic tool by Etalab showing every plot of land and property in France with its unique code, its contents, and the price of any sales since 2014”.
Thumbs up for French Open Data, and a great platform indeed by Etalab (a Government agency).
Data heroines & heroes
Episode 26: Conversation with Lisa Charlotte Rost (Datawrapper)
In this episode of Conversations with Data, the European Journalism Centre’s podcast, we hear from Lisa Charlotte Rost of Datawrapper. A transcript is available here.
Visual
How the pandemic pummeled the world’s most famous shopping streets
Sad news, but the data visualization attached to this Quartz article is pretty cool.
Become a GitHub Sponsor. It costs about the price of a coffee per month, and you’ll get an Open Data Rottweiler sticker (and other stuff).
If you’re a supporter of this newsletter, thanks a lot for your support. Share this e-mail with a friend, or via social media.
quantum of sollazzo is supported by my GitHub Sponsors and Buy Me A Coffee supporters, and by ProofRed’s excellent proofreading service. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.