#550: quantum of sollazzo – 30 Jan 2024
The data newsletter by @puntofisso.
Hello, regular readers and welcome new ones :) This is Quantum of Sollazzo, the newsletter about all things data. I am Giuseppe Sollazzo, or @puntofisso. I've been sending this newsletter since 2012 to be a summary of all the articles with or about data that captured my attention over the previous week. The newsletter is and will always (well, for as long as I can keep going!) be free, but you're welcome to become a friend via the links below.
We have some great new sponsored content: Ed Freyfogle, organiser of location-based service meetup Geomob, co-host of the Geomob podcast, and co-founder of OpenCage, has offered to introduce a set of points around the topic of geodata. His first entry, on addresses and why they can be a nightmare, starts a few paragraphs below.
The most clicked link last week was this incredibly detailed Tableau viz of crime in London boroughs.
'Til next week,
Giuseppe @puntofisso
Before you go... DO YOU LIKE QUANTUM OF SOLLAZZO? → BECOME A SUPPORTER! :) If you enjoy this newsletter, you can support it by becoming a GitHub Sponsor. Or you can Buy Me a Coffee. I'll send you an Open Data Rottweiler sticker. You're receiving this email because you subscribed to Quantum of Sollazzo, a weekly newsletter covering all things data, written by Giuseppe Sollazzo (@puntofisso). If you have a product or service to promote and want to support this newsletter, you can sponsor an issue.
✨ Topical
Global Warming Picks Up Speed
"Here are yearly average temperature anomalies for the whole planet, from 1950 through 2023, according to HadCRU, the Hadley Centre/Climate Research Unit in the U.K."
Obviously, this only captures the average, which shows part of the problem but not its full complexity, such as the ever-increasing variance of weather events.
Trump’s biggest Iowa gains are in evangelical areas, smallest wins in cities
Analysis by the Washington Post: "Trump dominated the caucuses in the style of other Republican winners of the past 20 years, a pattern that works in Iowa but did not propel them to win the nomination. Meanwhile, Trump’s weakest performance was in the parts of Iowa that more closely resemble the rest of the country, with fewer White evangelical Christians, fewer farmers and more people living in cities with higher education and more income."
What are the top US exports to China?
From USAFacts. I wasn't expecting this: "Soybeans were the nation’s top export to China in 2022, making up 11.6% of overall export value."
World's Biggest Data Breaches & Hacks
"Selected events over 30,000 records stolen" by Information is Beautiful, 2004-2024.
The bitcoin ETF saga
Masterly illustrated by Axios.
Good news: Open, global datasets like OpenStreetMap make getting lots of geodata easier than ever.
Bad news: Now you have to sort through it, which can be an i18n nightmare.
Example: given this geodata for a location in Spain, which address would a normal person expect?
"components": {
    "ISO_3166-1_alpha-2": "ES",
    "ISO_3166-1_alpha-3": "ESP",
    "ISO_3166-2": [
        "ES-CT",
        "ES-B"
    ],
    "_category": "building",
    "_type": "building",
    "city": "Barcelona",
    "city_district": "Sarrià - Sant Gervasi",
    "continent": "Europe",
    "country": "España",
    "country_code": "es",
    "county": "Barcelonés",
    "county_code": "B",
    "house_number": "68",
    "neighbourhood": "les Tres Torres",
    "political_union": "European Union",
    "postcode": "08017",
    "road": "Carrer de Calatrava",
    "state": "Cataluña",
    "state_code": "CT",
    "state_district": "Barcelona"
},
At OpenCage we’ve open-sourced the templates we use to convert address data into well-formatted strings for the 240+ territories around the world, so we know the correct answer is Carrer de Calatrava, 68, 08017 Barcelona, España. This is just one of many small steps we’ve taken to make developers’ lives easier.
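To make the idea of per-territory templates a bit more concrete, here is a minimal Python sketch. It is not OpenCage's actual template format or code: the `TEMPLATES` table, its layout, and the `format_address` helper are all made up for illustration, and the "US" entry is only an assumed ordering added for contrast. The point is simply that the same components need a different order and different separators depending on the country.

```python
# Hypothetical, heavily simplified templates (NOT OpenCage's real format).
# Each inner list is one comma-separated segment; keys within a segment
# are joined with a space.
TEMPLATES = {
    "ES": [["road"], ["house_number"], ["postcode", "city"], ["country"]],
    # Assumed ordering, for contrast only: house number before the road.
    "US": [["house_number", "road"], ["city"], ["state_code", "postcode"], ["country"]],
}


def format_address(components, templates=TEMPLATES):
    """Build a one-line address from a components dict using a country template."""
    code = components.get("ISO_3166-1_alpha-2", "").upper()
    if code not in templates:
        raise KeyError(f"no template for country code {code!r}")
    segments = []
    for keys in templates[code]:
        # Skip missing or empty components instead of leaving gaps.
        words = [str(components[k]) for k in keys if components.get(k)]
        if words:
            segments.append(" ".join(words))
    return ", ".join(segments)


# A subset of the components from the example above.
components = {
    "ISO_3166-1_alpha-2": "ES",
    "city": "Barcelona",
    "country": "España",
    "house_number": "68",
    "postcode": "08017",
    "road": "Carrer de Calatrava",
    "state": "Cataluña",
}

print(format_address(components))
# Carrer de Calatrava, 68, 08017 Barcelona, España
```

Real-world templates have to deal with far more than ordering (fallbacks between keys such as city, town and village, abbreviations, multi-line output), which is where a maintained set of templates earns its keep.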
Anyone looking for an entertaining view of the technical complexity of addresses should read “Falsehoods programmers believe about addresses”. Meanwhile, in the category of not sure whether to laugh or cry, we have last year's news of the German town that voted “no” to adopting street names.
If your project calls for well-formatted addresses, give the OpenCage geocoding API a try.
🛠️📖 Tools & Tutorials
Machine Learning Engineering Open Book
"An open collection of methodologies to help with successful training of large language models and multi-modal models. This is a technical material suitable for LLM/VLM training engineers and operators. That is the content here contains lots of scripts and copy-n-paste commands to enable you to quickly address your needs."
Maps & GIS in the browser
"Atlas is how teams make maps and perform geospatial analysis together. Create, collaborate, share — all under one roof."
Not a free service, but with a free tier.
US Census Geocoder
"Census geocoder provides interactive & programmatic (REST) access to users interested in matching addresses to geographic locations and entities containing those addresses."
I hadn't come across this service before; it's good if you're planning to create dataviz about US stats.
(via Jeremy Singer-Vine's Data Is Plural)
Reading QR codes without a computer
If you really want to... :-) But it's a very good and well illustrated example of how data encoding works.
CSS3D Clouds
"An experiment on creating 3d-like clouds with CSS3 3D Transforms and a bit of Javascript."
They look pretty good!
Creating with Data
A website with a few examples, some of which run off this Glitch page.
Felt
"A better way to work with maps. Powerful enough for GIS Pros, easy enough for everyone else."
Another make-maps-in-the-browser tool, this too with a free tier. It looks like it offers advanced GIS features.
Base Python Rgonomic Patterns
"Getting comfortable in a new language is more than the packages you use. Syntactic sugar in base python increases the efficiency, and aesthetics of python code in ways that R users may enjoy in packages like glue and purrr. This post collects a miscellaneous grab bag of tools for wrangling, formatting (f-strings), repeating (list comprehensions), faking data, and saving objects (pickle)"
AI for Web Devs: Project Introduction & Setup
"In this blog post, we start bootstrapping a web development project using Qwik and get things ready to incorporate AI tooling from OpenAI."
🤯 Data thinking
Muscle memory in data work
"For much of my early career working as a data analyst, I had a peculiar habit – for daily ad hoc DB queries for random questions I'd get throughout the workday, I always rewrote the queries from scratch whenever possible."
📈 Dataviz, Data Analysis, & Interactive
Hoodmaps
Hoodmaps, first introduced on Reddit 6 years ago, is a crowdsourced map of neighbourhoods: "I always have trouble getting an overview of how a city is made up and which areas to go and which to avoid. So I made a map that everyone can draw colors on. Each color represents a simplified category of an area, like "hipster" or "suits". It's crowdsourced which means it shows everyone's contributions and averages them."
It reminds me of this paper from over a decade ago by a group of researchers who used Flickr tags to redesign neighbourhood boundaries.
(via Anantharaman Iyer)
Waterway Map
A map showing "how are waterway connected in OpenStreetMap".
How we made an animated movie in 8kB
This blog post explains it: "In November 2022, we set ourselves a challenge: make a real-time animation that looks like a standard short animated movie, with the constraint that it should fit in 8 kilobytes. The goal was to have decent graphics, animations, direction and camera work, and the matching music".
The source code is also available on GitHub.
Startup Funding Simulator
This is pretty good at showing the impact of dilution.
Visualizing vernacular variety in various vinyls
Datawrapper's Jack Goodall: "I wanted to try to visualize not only the absolute size of my favorite musicians’ vocabulary, but also how unique their lyrics are relative to other artists’."
🤖 AI
Prediction for journalism in 2024 — More open source AI
Views by Burt Herman, Hacks/Hackers co-founder and board chair.
"We have barely scratched the surface of how this new way to interact with machines using simple language will supercharge human capabilities. AI developments relating to journalism will take a few directions in the coming year."
New Theory Suggests Chatbots Can Understand Text
"Far from being “stochastic parrots,” the biggest large language models seem to learn enough skills to understand the words they’re processing."
Mmm...
quantum of sollazzo is also supported by Andy Redwood’s proofreading – if you need high-quality copy editing or proofreading, check out Proof Red. Oh, and he also makes motion graphics animations about climate change.
Supporters*
Alex Trouteaud
casperdcl
[*] This is for all $5+/month GitHub sponsors. If you are one of those and don't appear here, please e-mail me.