621: quantum of sollazzo
#621: quantum of sollazzo – 26 August 2025
The data newsletter by @puntofisso.

Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I've been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you're welcome to become a friend via the links below.
The most clicked link last week was Axios' look at journalist deaths in the Israel-Gaza war.
I've built a thing: Better Word Counter
Please meet Better-Word-Counter.com.
It's brought to you by me and vibe coding – it's a privacy-conscious word counter with a little extra twist: the ability to count words and characters of text split in multiple sections, factoring in titles if you like, and exporting it all to a few document formats.
If it feels a bit... niche, it totally is: a few weeks ago, I was writing my application to the British Computer Society CITP certification, which required a text of 440 words, split over 4 sections, where the first two had to amount to about 30% of the total. I realised that my usual go-to tools – wordcounter.net, Google Docs, and MS Word – didn't allow me to do that in a straightforward way, so I ended up scripting a bit to get it done.
So when I wanted a simple coding project to test Claude Code, I had an "a-ha" moment. Maybe not a world-changing tool, but please do try it and let me know what you think of it or if you have any feature requests.
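For the curious, the kind of check described above (a total word count, a count per section, and the share taken by the first few sections) fits in a few lines of Python. This is only a minimal sketch, not the code behind Better-Word-Counter.com: the function names are made up for illustration, and "a word" is assumed to mean any whitespace-separated token.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def section_report(sections: dict[str, str]) -> dict[str, tuple[int, float]]:
    """Map each section title to (word count, share of the total word count)."""
    counts = {title: count_words(body) for title, body in sections.items()}
    total = sum(counts.values())
    return {
        title: (n, n / total if total else 0.0)
        for title, n in counts.items()
    }


def leading_share(sections: dict[str, str], k: int) -> float:
    """Share of the total word count taken by the first k sections."""
    counts = [count_words(body) for body in sections.values()]
    total = sum(counts)
    return sum(counts[:k]) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical four-section draft; titles and bodies are placeholders.
    draft = {
        "Section 1": "one two three",
        "Section 2": "four five six",
        "Section 3": "word " * 7,
        "Section 4": "word " * 7,
    }
    for title, (words, share) in section_report(draft).items():
        print(f"{title}: {words} words ({share:.0%})")
    print(f"First two sections: {leading_share(draft, 2):.0%} of the total")
```

With a 440-word limit and a 30% target for the first two sections, you would compare the total against 440 and `leading_share(draft, 2)` against 0.30.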
AMA – Ask Me Anything! Submit a question via this anonymous Google form. I'll select a few every 4-5 weeks and answer them on here :-) Don't be shy!
The Quantum of Sollazzo grove now has 31 trees. It helps manage this newsletter's carbon footprint. Check it out at Trees for Life.
'Til next week,
Giuseppe @puntofisso.bsky.social
🛎️ Things that caught my attention
This is a pretty good read: "Damn The Race to the Bottom: AI Shouldn't Be Cost-Cutting Machine". Colin Davis is one of my colleagues on the Goldsmiths University Computing External Advisory Board, and his ideas are refreshingly thought-provoking: "Being asked to cut 20–30% of creative costs and labour under the banner of ‘AI’ is unimaginative at best, and corrosive at worst", he says. Obviously, it's not just the creative industry that suffers from this.
You may know how much I love cheese (and that I'm an occasional home cheesemaker). Journalist Leo Benedictus has an interesting look at cheese data in his newsletter, which includes the very good chart below – if you read the whole article, you'll also get the 3D version that shows water, protein, and fat.
✨ Topical
Mole or cancer? The algorithm that gets one in three melanomas wrong and erases patients with dark skin
"The Basque Country is implementing Quantus Skin in its health clinics after an investment of 1.6 million euros. Specialists criticise the artificial intelligence developed by the Asisa subsidiary due to its “poor” and “dangerous” results. The algorithm has been trained only with data from white patients."
This is a brilliant article, with one of those "what could possibly go wrong?" moments.
The shocking thing is that it should be easy to test for such issues, and I have a personal story that relates to this.
When I directed the NHS AI Skunkworks, we stopped at discovery for a dermatology project because it became evident that the input data was biased – all it took was to check it against the Fitzpatrick Scale, which is the official measure of skin shade used by dermatologists. How did no one spot this before it went into production?
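To give a sense of how cheap such a check can be, here is a minimal sketch in Python. It is not what the NHS AI Skunkworks actually ran: it assumes a hypothetical dataset in which each image already carries a Fitzpatrick type label (I to VI), and the 5% threshold is an arbitrary placeholder rather than a clinical standard.

```python
from collections import Counter

# The six Fitzpatrick skin types, from lightest (I) to darkest (VI).
FITZPATRICK_TYPES = ("I", "II", "III", "IV", "V", "VI")


def fitzpatrick_distribution(labels: list[str]) -> dict[str, float]:
    """Share of the labelled images that falls into each Fitzpatrick type."""
    counts = Counter(labels)
    total = sum(counts[t] for t in FITZPATRICK_TYPES)
    return {
        t: (counts[t] / total if total else 0.0)
        for t in FITZPATRICK_TYPES
    }


def underrepresented(labels: list[str], min_share: float = 0.05) -> list[str]:
    """Types whose share is below min_share (an assumed, arbitrary threshold)."""
    distribution = fitzpatrick_distribution(labels)
    return [t for t, share in distribution.items() if share < min_share]


if __name__ == "__main__":
    # Hypothetical labels for a made-up training set of 100 images.
    labels = ["I"] * 40 + ["II"] * 50 + ["III"] * 10
    for skin_type, share in fitzpatrick_distribution(labels).items():
        print(f"Type {skin_type}: {share:.0%}")
    print("Underrepresented:", underrepresented(labels))
```

A training set where types IV, V, and VI come back at or near zero is a warning sign well before any model is trained.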
Score Electoral District Maps
"Most of our federal and state legislators are elected from districts. Every ten years, state governments redraw district boundaries in a process known as redistricting. PlanScore promotes fairness in the redistricting process. We make it easy for policymakers and advocates to score new district maps and assess whether they’re fair or gerrymandered. We also provide access to the most comprehensive historical dataset of partisan gerrymandering ever assembled."
How the audiences of 30 major news sources differ in their levels of education
The Pew Research Center finds that "American audiences of 30 prominent news sources vary dramatically in their levels of education".
🛠️📖 Tools & Tutorials
An Interactive Guide to SVG Paths
Josh Comeau has written this follow-up to his previous guide, "A Friendly Introduction to SVG", featured in Quantum 617.
Presidio
Presidio is a Microsoft "open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines."
Yellow, Purple, and the Myth of “Accessibility Limits Color Palettes”
A tutorial that includes 6 reusable WCAG-tested palettes.
Apache ECharts
Oddly, I hadn't heard of Apache ECharts before. It's Apache's own "Open Source JavaScript Visualization Library". Documentation and examples are good.
APIs don't make good MCP Tools
Reilly Wood: "The Model Context Protocol (MCP) is a pretty big deal these days. It’s become the de facto standard for giving LLMs access to tools that someone else wrote, which, of course, turns them into agents. But writing tools for a new MCP server is hard, and so people often propose auto-converting existing APIs into MCP tools; typically using OpenAPI metadata (1, 2). In my experience, this can work but it doesn’t work well."
Or... maybe it's that MCPs don't make good API tools! Just use the API as intended...
AGENTS.md
"A simple, open format for guiding coding agents, used by over 20k open-source projects.
Think of AGENTS.md as a README for agents: a dedicated, predictable place to provide the context and instructions to help AI coding agents work on your project."
Craft Beautiful Patterns Backgrounds
"Professional-grade background patterns and gradients. Easily copy the code and seamlessly integrate it into your projects."
MythBusting Large Language Models
"Chatbots can be deceptive. How do LLMs really work under the hood?"
This is a simple tutorial to get a broad understanding of how LLMs work.
🤯 Data thinking
What if "big data" just…isn't worth very much?
The consistently good James Ball: "I'm not saying the emperor has no clothes. I am saying his clothes are cheap, tacky, don't work and are seriously overrated."
Is the logic of the original smoking study valid?
"Out of some curiosity and exercise of my understanding of statistics, I've looked up the original study which statistically proved the causal relationship between smoking and lung cancer."
📈 Dataviz, Data Analysis, & Interactive
Moving objects in 3D space
A brilliant interactive exploration.
Also, don't miss the commentary on Hacker News: "I was wondering how I can arrange objects along a spherical helix path, and read some articles on it.
I ended up learning about parametric equations again, and make this visualization to document what I learned."
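If you want to play with the idea, here is a minimal Python sketch of placing points along a spiral on a sphere using parametric equations. It is not taken from the linked visualization: it assumes one common parametrisation, where the polar angle sweeps from pole to pole while the azimuth winds around a fixed number of times, and the function name and parameters are made up for illustration.

```python
import math


def spherical_helix(
    n_points: int, turns: float = 8.0, radius: float = 1.0
) -> list[tuple[float, float, float]]:
    """Place n_points along a spiral that winds from the north to the south pole.

    Assumed parametrisation, with u running from 0 to 1:
        theta = pi * u              (polar angle, pole to pole)
        phi   = 2 * pi * turns * u  (azimuth, winding around the axis)
    """
    points = []
    for i in range(n_points):
        u = i / (n_points - 1) if n_points > 1 else 0.0
        theta = math.pi * u
        phi = 2.0 * math.pi * turns * u
        x = radius * math.sin(theta) * math.cos(phi)
        y = radius * math.sin(theta) * math.sin(phi)
        z = radius * math.cos(theta)
        points.append((x, y, z))
    return points


if __name__ == "__main__":
    for x, y, z in spherical_helix(5, turns=2.0):
        print(f"({x:+.3f}, {y:+.3f}, {z:+.3f})")
```

Every point lies on the sphere by construction, because the three coordinates are just spherical coordinates with a fixed radius.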
How to draw a Space Invader
Developer Stanko Tadić created this generator and here he explains how it works and how to get good ones.
Which cities have the highest murder rates?
USAFacts: "Memphis, Tennessee, had higher murder rates in its home county than any other major city in the US."
Visualizing dissonance
Datawrapper's Luc Guillemot takes a look at the topic of sensory dissonance in music.
Spam PACs Raise Money by Deceiving Seniors
"One 89-year-old woman made 7,532 donations totaling $68,666".
Is Rotten Tomatoes Still Reliable? A Statistical Analysis
StatSignificant's Daniel Parris's latest investigation asks: "Can Hollywood's stamp of artistic excellence still be trusted?"
🤖 AI
MIT report: 95% of generative AI pilots at companies are failing
Paywalled article. The title, however, is all you need...
AI Is a Mass-Delusion Event
Sorry, this, too, is paywalled.
Charlie Warzel for The Atlantic: "Three years in, one of AI’s enduring impacts is to make people feel like they’re losing it."
"How many people, I wonder, had to agree that this was a good idea to get us to this moment?"
Dan Hon's LLM commentary
Really good issue of Dan Hon's newsletter: "First, LLMs do a remarkable job of answering some of the questions, like specific questions in history (“Claude’s answer to this question is, in my opinion, astonishingly good, since it leverages the superhuman linguistic and geographic knowledge of LLMs to excellent effect”)
Second, current LLMs do a terrible job of answering essays on topics like “water” and “immediately spin off into BS”, ranging from low-level recitation of facts about water, to... well, sounding like an insufferable Oxford don? Which okay, fine, some of the questions practically invite sounding like an Oxford don, but that’s not entirely the point."
We no longer use vector embeddings for any clinical queries, and won't anytime soon.
Matt Lee on LinkedIn: "One of the main problems we ran into using embeddings on large bodies of text was that it wasn't reliable; they'd often miss short but important statements, or include sections that were semantically similar but irrelevant.
More importantly, we had an extremely limited ability to tweak the performance of embeddings. This not only made debugging really difficult, but limited how much control we have over needle-in-haystack style searches over text."
DID YOU LIKE THIS ISSUE? → BUY ME A COFFEE!
You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
quantum of sollazzo is also supported by Andy Redwood’s proofreading – if you need high-quality copy editing or proofreading, check out Proof Red. Oh, and he also makes motion graphics animations about climate change.