#509: quantum of sollazzo – 14 March 2023
The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I've been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you're welcome to become a friend via the links below.
As of this month, you can buy advertisement space in Quantum of Sollazzo directly. Just head to the cal.com scheduler and select an issue :)
Usual disclaimer: I reserve the right to decline and refund ads that are not suitable to this community etc etc. For everything else, the Sponsor Kit can be found on the newsletter home page.
I want to keep the newsletter free for as long as I can, but with the readership growing, so are the costs involved in sending it. Most of these are covered by Friends of the newsletter (link below) and occasional sponsors for now, but it would be good to have a pipeline to make sure I don't have to suddenly stop if things get out of control (or maybe this is the dream...).
The most clicked link last week was that hilarious tweet by Alex Selby-Boothroyd. Well done Alex, if you're reading this there's a drink on me for you.
Speaking of quotable content, Tim Harford wrote about ChatGPT in a very enjoyable article, which includes this brilliant line: "What ChatGPT deals in is not truth; it is plausibility".
And last, let's have fun!
I spotted a Raspberry Pi Zero W become available on the Pi Hut and snatched it. The Zero W is the low-memory model which offers WiFi connection. I'm thinking of using it to remotely monitor the allotment (wishful thinking) but your suggestions on what to create and share would be welcome :)
'till next week,
Giuseppe @puntofisso
Become a Friend of Quantum of Sollazzo from $1/month →
If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker.
You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
✨ Topical
Inside the Suspicion Machine
"Obscure government algorithms are making life-changing decisions about millions of people around the world. Here, for the first time, we reveal how one of these systems works."
Interestingly, this story on Wired.com comes with a full description of its methodology, written by its authors.
Ukrainians live in a state of alert as the battle for the skies rages above them
"Air supremacy is being contested from the ground, while civilians face a daily game of risk assessment as thousands of air-raid sirens blare across the country. Which ones do they need to listen to?"
Paywalled, but there are a lot of very nice visualizations in this article, some of which you'll see on this Twitter thread by journalist Venetia Menzies.
(via Chris Weston)
Mapping Diversity
"OBC Transeuropa, together with 8 other EDJNet partners, looked into the names of 145,933 streets in 30 major cities across 17 European countries."
As usual, they share all data, including comparative aggregate data, a list of all the streets named after one or more women (csv version), and a list of all the women giving their name to one or more streets (csv version).
Drought in California
"Short storms, however strong, are not enough to end California’s drought."
A very good analytical approach here by Reuters.
Get smarter every day
Every day Refind picks 5 articles that make you smarter, tailored to your interests. Loved by 100k+ curious minds.
Subscribe to get 5 links / day
🛠️📖 Tools & Tutorials
Tufte CSS
"Tufte CSS provides tools to style web articles using the ideas demonstrated by Edward Tufte’s books and handouts."
13 SQL Statements for 90% of Your Data Science Tasks
Just the very basics of SQL. Worth a refresh, every now and then.
Online gradient descent written in SQL
At the other end of the spectrum from the previous link...
"In this post, I want to push this idea, and actually implement a machine learning algorithm within a relational database, using SQL."
5 Changepoint Detection algorithms every Data Scientist should know
"Essential guide to changepoint detection algorithms for time series analysis."
I had never encountered "changepoint" as a concept, but it's basically the point at which a time series somehow changes, e.g. some signal goes from positive to negative, or there is a change of magnitude, etc.
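To make the idea concrete, here is a minimal Python sketch of the simplest case: finding a single shift in the mean of a series. It is not one of the five algorithms from the linked article, just an illustration of the concept; the function name find_mean_changepoint and the sample signal are made up for this example.

```python
def find_mean_changepoint(series):
    """Return the index where the series most plausibly shifts to a new mean.

    Every possible split point is tried. The one kept is the split that
    minimises the total squared deviation of each segment from its own mean.
    The returned index is the first element of the second segment.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")

    best_index, best_cost = None, float("inf")
    for k in range(1, n):
        left, right = series[:k], series[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        cost = sum((x - mean_left) ** 2 for x in left)
        cost += sum((x - mean_right) ** 2 for x in right)
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index


# Hypothetical signal: hovers around 1, then jumps to around 5.
signal = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0, 5.2, 4.9, 5.1, 5.0]
changepoint = find_mean_changepoint(signal)
print(changepoint)
```

This brute-force version only handles one change in the mean and always returns a split, even when nothing really changed, so it is a toy rather than a detector you would rely on.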
Help constituents explore budgets
Not a tool per se, but the latest issue of the civictech.guide newsletter has a few links to tools used to monitor (mostly government) budgets.
Introduction to PWAs (Progressive Web Apps)
A super short starter which then continues into a Glitch webapp.
NICAR 2023 tipsheets
A collection of speakers' tipsheets and slides from NICAR, the investigative journalism conference.
graph-tool
"Graph-tool is an efficient Python module for manipulation and statistical analysis of graphs (a.k.a. networks). Contrary to most other Python modules with similar functionality, the core data structures and algorithms are implemented in C++, making extensive use of template metaprogramming, based heavily on the Boost Graph Library. This confers it a level of performance that is comparable (both in memory usage and computation time) to that of a pure C/C++ library."
Mathesar
"Mathesar is a straightforward open source tool that provides a spreadsheet-like interface to a PostgreSQL database."
📈Dataviz, Data Analysis, & Interactive
The end of range anxiety: how has the range of electric cars changed over time?
A good look at EV ranges by Hannah Ritchie, which includes a splendid data viz.
Predicting wine quality using chemical properties
"Can we predict the quality/taste of a wine knowing its chemical properties?"
Also: link to the Jupyter notebook used for the analysis.
10,000 Tremors
Reuters uses data from the US Geological Survey to visualize the Turkey earthquake.
🤖 AI
Ask Paper
A fantastic academic paper summarizer created by the Hippo AI foundation. I've tested it with my thesis paper and it works brilliantly.
Census GPT
US Census queries using ChatGPT. Will the answers be correct, or just plausible? Source code is available on GitHub.
(via Massimo Conte)
The Waluigi Effect (mega-post)
An explanation of the brilliantly named Waluigi Effect, which is defined as follows: "After you train an LLM to satisfy a desirable property P, then it's easier to elicit the chatbot into satisfying the exact opposite of property P".
Basically, language models become more prone to bullshitting the better they get at giving credible answers.
"The better the model, the more likely it is to repeat common misconceptions."
quantum of sollazzo is supported by ProofRed's excellent proofreading. If you need high-quality copy editing or proofreading, head to http://proofred.co.uk. Oh, they also make really good explainer videos.
Supporters*: casperdcl and iterative.ai, Jeff Wilson, Fay Simcock, Naomi Penfold
[*] this is for all $5+/month GitHub sponsors. If you are one of those and don't appear here, please e-mail me.