Hello thereeee,
This is the first issue of my newsletter, and you're one of the very first people to ask for it, so for that, I thank you! February always seems to pull the rug from under me when it comes to assuming how much time I have. I always think I'll push doing something to the last week of February, but before I know it the 28th has annoyingly sprung up on me, grinning like it knew all along I'd be caught short. So here we are, on March 2nd instead of 'the end of February', like I promised myself.
Here are some things that I've been up to this month.
I recently found out that the CBFC (Central Board of Film Certification) has a website which lists all modifications and cuts made to any movie being released in India. Exploring that data is fascinating because it is a glimpse into the varying sensibilities that are applied to what we can watch. Here's a cleaned-up sample of the data so far. The most interesting column is 'description', which tells you what modification was made.
From (variably) removing violence to stuff like "Reduced the visuals of students burning the news papers and protesting in the college", there's a lot of stuff they deal with. We have plans to make a regularly updated explorer where you'll be able to browse any movie and see how it was modified before it reached us.
In the public Discord channel for this project, we've been discussing how we're scraping this data (like brute-forcing and decoding movie IDs). It is a fun puzzle-solving exercise, but I can't wait to analyse it soon. This is the boring part.
The biggest learning here has been how important filtering is for effective data analysis and visualization. There's a lot of data here, in the form of text and types of media the CBFC covers and what not, but the key to making sense of it is deciding what makes sense to keep. For example, the dataset also has modifications for ads, music videos, and trailers. Filtering based on the duration of the type of media saves me a lot of trouble:
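A minimal sketch of what that duration filter could look like in Python. The record fields (title, duration_minutes, description), the 60-minute cutoff, and the sample rows are all assumptions for illustration, not the actual CBFC columns or the threshold used in the project.

```python
from dataclasses import dataclass


@dataclass
class Certificate:
    # Hypothetical record shape; the real CBFC data may use other column names.
    title: str
    duration_minutes: float
    description: str


# Assumed cutoff: anything shorter is treated as an ad, trailer, or music video.
MIN_FEATURE_MINUTES = 60.0


def keep_feature_films(records, min_minutes=MIN_FEATURE_MINUTES):
    """Return only the records long enough to count as a feature film."""
    return [r for r in records if r.duration_minutes >= min_minutes]


# Made-up sample rows, only here to show the filter in action.
sample = [
    Certificate("Example Feature", 142.0, "Reduced the visuals of a protest"),
    Certificate("Example Trailer", 2.5, "Muted a word"),
    Certificate("Example Ad", 0.5, "Inserted a disclaimer"),
]

features = keep_feature_films(sample)
print([r.title for r in features])
```

The cutoff is a plain parameter so it is easy to tune once you see how durations are actually distributed in the data.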
For classifying the type of modification, just filtering certain words out of the descriptions gives me more structured metadata to work with. No fancy LLM classification needed.
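Here's a rough sketch of that kind of keyword matching. The category labels and keyword lists are hypothetical, picked only to illustrate the idea; the real lists would come from reading through the actual descriptions.

```python
import re

# Hypothetical labels and keywords; the first matching label wins.
ACTION_KEYWORDS = {
    "removed": ("removed", "deleted", "cut"),
    "reduced": ("reduced", "shortened"),
    "replaced": ("replaced", "substituted"),
    "muted": ("muted", "beeped"),
}


def classify_modification(description):
    """Label a free-text description by the first keyword group it matches."""
    # Lowercase and split into whole words so "cut" does not match "cutlery".
    words = set(re.findall(r"[a-z]+", description.lower()))
    for label, keywords in ACTION_KEYWORDS.items():
        if words.intersection(keywords):
            return label
    return "other"


print(classify_modification(
    "Reduced the visuals of students burning the news papers "
    "and protesting in the college"
))
```

Anything that falls into "other" is a good hint about which keywords are still missing from the lists.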
Apart from the data but adjacent to it, learning how to filter Google searches smartly shows me documents I might have missed otherwise:
site:*.gov.in filetype:pdf film certification inurl:report
This is called Google Dorking! You can read more about it here.
I will write some more on these things later this month if time allows. The last one, especially, is sometimes really useful.
In a conversation with my friend Vivek, I said something similar to:
Delhi: 26C, BLR: 31C yesterday, this city is gone.
To which he replied to me with a picture of a chart with the average temperatures of both the cities for this time of year, saying:
Delhi is significantly above its mean daily maximum, Bangalore is not. Now you can make dataviz showing citywise temperature variation against historical mean.
And well, why not? The India Meteorological Department releases hefty 'climatological tables' for almost every major city and town (you can sometimes find these tables in Wikipedia pages for that city, like for Bangalore), which tell you what the average min and max temperatures for that time of the year are, average cloud cover, humidity and so on. We're creating a small site which will let you find your city and answer the basic question, “Is my city hotter/colder/drier/wetter/windier than usual?” This has also been a deep dive into all the different portals that the IMD has for reporting different kinds of variables, and I found neat stuff like this ultra-maximalist 'meteogram', which is super cute because look at how their cloud cover chart looks like a skyline. You don't have to understand it, look at the clouds:
The design for this explorer will be very telegram-inspired. Here's a look at the site so far, with the random photo of the telegram I ripped off.
Vivek has already cleaned up and assembled the historical data from IMD if you'd like to see it.
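The "hotter or colder than usual?" question boils down to subtracting the historical mean from today's reading. A minimal sketch, assuming a lookup of mean daily maximums keyed by city and month; the city names, the numbers, and the 1°C tolerance are placeholders, not real IMD values or the site's actual logic.

```python
# Placeholder normals: (city, month) -> mean daily maximum in Celsius.
# These numbers are made up for illustration, not taken from IMD tables.
MEAN_DAILY_MAX_C = {
    ("city_a", 2): 25.0,
    ("city_b", 2): 30.0,
}


def anomaly_c(city, month, observed_max_c, normals=MEAN_DAILY_MAX_C):
    """Difference between the observed maximum and the historical mean."""
    return observed_max_c - normals[(city, month)]


def describe(city, month, observed_max_c, tolerance_c=1.0):
    """Turn the anomaly into a plain-language answer."""
    diff = anomaly_c(city, month, observed_max_c)
    if diff > tolerance_c:
        return "hotter than usual"
    if diff < -tolerance_c:
        return "colder than usual"
    return "about usual"


print(describe("city_a", 2, 28.0))
print(describe("city_b", 2, 30.5))
```

The same subtraction would work for rainfall, humidity or wind, given a matching table of normals for each variable.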
I watched Nosferatu a few weeks ago and, having only read the original Dracula, I didn't know that it moves the entire story from London to somewhere in Germany. I somehow got to reading more about the time and politics within which Dracula was written, which is always an interesting exercise with books of that period, especially ones this well-known. Published in 1897, at the turn of the century and the height of the British Empire, Dracula essentially embodied the “Eastern Other” invading the heart of the Empire from backward, remote and uncivilized Transylvania (which is in Eastern Europe). I then found this wonderful paper which talks about how Stoker, who never set foot in Transylvania himself, was not describing but inventing it with the prevalent 'imperial and colonial geographic imagination' of the East. Over a hundred years later, the image has stuck. As the paper notes:
"Clearly, Romania is not represented in the way that it would choose to represent itself but instead in the way that the West chooses to represent it.... As such, Transylvania can be identified as a site of cultural struggle between Western representations of the region as the home of the strange and the supernatural and Romania’s efforts to define itself in its own way and on its own terms."
I highly recommend reading this paper, and some other works by the author, who writes on cultural aspects of tourism in Romania and 'human geographies'. If you visualize Transylvania in your head, I'm curious about what you imagine. Does it conjure up images of a scraggly landscape with castles and monsters? It does for me. And it can all be traced back to this one book.
Related: This article by The Pudding, “Do writers write where they know?”.
Speaking of fictional geographies, have a look at this detailed map of Springfield from The Simpsons. I love stuff like this. Springfield is an abstraction of Anytown, USA, and this map is a condensed, spatial summary of that abstraction.
That's all from me this month, see you in a few weeks!
Aman
PS: Did you like something? Was this boring? Want to discuss something related to any of this? You are welcome to reply to this email! I'd love to hear from you.