Devlog #9

July 31, 2020

Hi there! This is the super-secret Loud Numbers development log newsletter, which you signed up to at some point in the hopefully-not-too-distant past.
I’m Duncan Geere, and together with my co-host Miriam Quick I’m making a data sonification podcast. This newsletter is a behind-the-scenes glimpse at what that actually looks like on a week-to-week basis. Shall we dive in?

Colour to Key, The Solution
A few weeks ago, we attempted to shift the pitch of the audio samples in our beer sonification based on our beer colour scores, so the lighter the colour of the beer, the higher the pitch. We tried a couple of different methods, neither satisfactory (see Devlog #6, Colour to Key). 
So this week we took the long route, and re-bounced all the samples from Logic in three different keys – shifted a tone up for light-coloured beers, a tone down for dark, and in the original key for the beers in the middle.
We then used if/else statements in Sonic Pi to play the sample corresponding to the colour score for that beer. We grouped the scores into three bins, so that beer colour is either -1, 0 or 1.
if colourscore[beerN] == -1
  xsample = 'path/to/darkbeersample.wav'
elsif colourscore[beerN] == 1
  xsample = 'path/to/lightbeersample.wav'
else
  xsample = 'path/to/midbeersample.wav'
end

# In Sonic Pi, `sample` plays an audio file; `play` triggers a synth note
sample xsample

The results sound much better than before. And limiting the pitch shift to a tone up or down is enough to create contrast between the beers without distorting the character of the sounds.
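
For anyone curious about the binning step, here is a minimal sketch of how raw colour scores could be grouped into the three bins. The thresholds, the 0 to 10 scale, the assumption that a higher score means a lighter beer, and the `colour_bin` helper are all invented for illustration, not the values we actually used.

```ruby
# Sort a raw colour score into one of three bins: -1 (dark), 0 (mid), 1 (light).
# Hypothetical: thresholds and scale are made up, and a higher score is
# assumed to mean a lighter beer.
def colour_bin(score, dark_below: 4.0, light_above: 7.0)
  return -1 if score < dark_below
  return 1 if score > light_above
  0
end

raw_scores = [2.5, 5.0, 8.1]                       # invented example scores
colourscore = raw_scores.map { |s| colour_bin(s) } # one bin per beer
```

The resulting array can be indexed per beer in the same way as `colourscore[beerN]` in the if/else statements above.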

Public Opinion
This week we also spent a bit of time integrating polling and currency data into our Brexit dataset. The latter will be the bassline, while the post-referendum popularity of “leave” and “remain” among the public will be represented by brass and strings respectively. 
In the process, we had to make a few decisions on how to handle different aspects of the data. The currency data was fairly easy - we converted daily values into weekly values by taking a simple average of the seven days. The polling data was a little more complex. To keep the methodology consistent, we wanted to use a single source of polling data - YouGov. But YouGov didn’t collect its data at regular intervals - its polls are spread out irregularly over the timeframe of the track.
We could have interpolated between these values, allowing public opinion to shift linearly between the polling times. But music rarely shifts linearly over time. It moves in steps. So we figured it would sound better to have the opinion measure stay constant until the next poll occurs. This increases the musical “impact” of each poll - you can hear when people have their say once more. Though we might have to quantise it a little to match the typical four-beats-in-a-bar structure of most music.
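
Both decisions can be sketched in a few lines of plain Ruby, which is what Sonic Pi runs on. This is a rough illustration rather than our actual code: the function names, the week-indexed hash of polls, and all the numbers are made up.

```ruby
# Average daily values into weekly values, seven days at a time.
# A final partial week is averaged over however many days it has.
def weekly_averages(daily_values)
  daily_values.each_slice(7).map do |week|
    week.sum / week.length.to_f
  end
end

# Hold each poll result constant until the next poll arrives,
# instead of interpolating linearly between polls.
# Weeks before the first poll come out as nil.
def hold_until_next_poll(polls, total_weeks)
  current = nil
  (0...total_weeks).map do |week|
    current = polls[week] if polls.key?(week)
    current
  end
end

# Invented daily exchange rates covering two weeks
daily_rates = [1.30, 1.31, 1.29, 1.28, 1.30, 1.32, 1.30,
               1.25, 1.24, 1.26, 1.25, 1.23, 1.27, 1.25]
weekly_rates = weekly_averages(daily_rates)

# Invented poll results, keyed by the week in which each poll happened
leave_polls = { 0 => 44, 3 => 46, 4 => 43 }
leave_by_week = hold_until_next_poll(leave_polls, 7)
# leave_by_week is [44, 44, 44, 46, 43, 43, 43]
```

Each value in `leave_by_week` only changes in a week where a new poll lands, which is the stepped behaviour described above.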

Load Code
To handle this new multidimensional dataset, we’ve reworked our data load code to work with header rows. Header rows are common in data you’d want to import, so we should be able to handle them.
This meant reworking the code so that the data is stored in a hash (Ruby’s name for a JS object or Python dict) rather than an array. That hash has a key matching the name of each column, which you have to set manually along with the normalisation values that you want to use. For each column, there’s an array of data.
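
As a rough sketch of that structure (not our actual loader), here is one way to turn text with a header row into a hash of column arrays and then normalise a column. The `load_columns` and `normalise` helpers, the column names, and the numbers are all hypothetical, and the parser assumes plain comma-separated numbers with no quoted fields.

```ruby
# Build a hash of column name => array of values from comma-separated text.
# The caller states whether there is a header row; nothing is auto-detected.
# Assumes numeric fields with no quoting or embedded commas.
def load_columns(text, has_header: true)
  rows = text.lines.map { |line| line.strip.split(',') }.reject(&:empty?)
  names = has_header ? rows.shift : (0...rows.first.length).map { |i| "col#{i}" }
  names.each_with_index.map do |name, i|
    [name, rows.map { |row| row[i].to_f }]
  end.to_h
end

# Rescale values so that a hand-picked min maps to 0 and max maps to 1.
def normalise(values, min, max)
  values.map { |v| (v - min) / (max - min).to_f }
end

# Invented example data with a header row
raw = "week,gbp_usd,leave\n1,1.30,44\n2,1.25,46\n"
data = load_columns(raw)
leave_norm = normalise(data['leave'], 40, 50)
```

Without a header row, the sketch falls back to generated names like `col0`, which is one reason for asking the user to say explicitly whether a header is present.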
It would be nice if the system could automatically detect whether the data has a header row and respond accordingly. But doing that would mean making assumptions about what the data and header rows contain, so I’m not sure it’s possible to build it in a way that wouldn’t occasionally break with weird headers or data.
Every other data load system I’ve seen asks the user to specify whether the data has a header, so I think that’s probably why. There’s nothing like building your own system to understand why other systems are the way they are!
