Return to the Office (Expenditures)
House office expenditure data and a Maryland mapping library
I didn’t become a reporter to dig into office spending, but it turns out I’ve been looking at one kind of office spending - from the U.S. House of Representatives - for more than 15 years.
When I first started looking at it, back in the stone ages of 2010, the spending records of lawmakers, committees and administrative offices were contained in - you guessed it - a terrible PDF. These documents had previously been published as books, so I suppose a PDF was an upgrade, but I never really felt that way.
The good folks at Sunlight Labs used those reports to put together the first public database of House expenditures and the first public database of House staffers, and wrote some fine Ruby code to make that happen.
When Sunlight shut down, that work came to ProPublica, including repositories for parsing the expenditures and the staffers database. By then, the House had finally decided that perhaps PDFs were not the only way to present what definitely is data and also began publishing CSV files.
As fond as I am of that Ruby code, it was split between two formats and relied on access to an API in order to standardize lawmakers using their Bioguide IDs. That API no longer exists, and I’m sadly not writing much Ruby anymore, so this was a good opportunity to make the data publicly available again.
Introducing House Expenditures, where you can download parsed and enhanced CSV files of official House office expenditures from late 2016 onwards (for now, at least).
Here’s what I added to the original data: that unique Bioguide ID, thanks to the United States Project, plus a standardized member name, party, state and district. You can see the Python code that does the matching and generates the results here. I did have to handle a few special cases via an overrides file because, among other things, the House couldn’t consistently spell Xavier Becerra’s last name, labeling him “Becarra” for a number of years.
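To give a sense of how that kind of matching with overrides can work, here is a minimal Python sketch. It is not the code from the repository: the column name `office`, the lookup table, the overrides mapping and the helper names are all hypothetical, and the Bioguide ID shown is a placeholder rather than a real identifier.

```python
# Hypothetical lookup table: normalized member name -> standardized details.
# In practice this would be built from a legislators file; the ID below is
# a placeholder, not a real Bioguide ID.
LEGISLATORS = {
    "XAVIER BECERRA": {
        "bioguide_id": "B000000",  # placeholder
        "member_name": "Xavier Becerra",
        "party": "D",
        "state": "CA",
    },
}

# Hypothetical overrides: spellings found in the source data that should be
# treated as a known member.
OVERRIDES = {
    "XAVIER BECARRA": "XAVIER BECERRA",
}

ADDED_FIELDS = ("bioguide_id", "member_name", "party", "state")


def normalize(office):
    """Uppercase, collapse whitespace and drop a leading 'HON.' prefix."""
    name = " ".join(office.upper().split())
    if name.startswith("HON. "):
        name = name[len("HON. "):]
    return name


def match_member(office):
    """Return member details for an office label, or None if there is no match."""
    key = normalize(office)
    key = OVERRIDES.get(key, key)  # apply a special-case spelling fix, if any
    return LEGISLATORS.get(key)


def enhance(rows):
    """Copy each row and add member fields, left blank for non-member offices."""
    enhanced = []
    for row in rows:
        member = match_member(row["office"]) or {}
        extra = {field: member.get(field, "") for field in ADDED_FIELDS}
        enhanced.append({**row, **extra})
    return enhanced
```

Keeping the special cases in a separate mapping means a misspelling gets fixed in one place, and rows for committees or administrative offices simply pass through with blank member fields.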
Right now the data doesn’t include the pre-2017 records that are only in PDFs, but that’s on my list, and I’ll keep the data updated going forward.
Mapping Maryland
Sometimes I have original ideas, but there’s nothing wrong with taking something that someone else has done and adapting it for your situation.
So when I saw Kieran Healy’s wonderful nycmaps, an R package that provides analysis-ready sf dataframes of New York City geographies, I thought: I should do that for Maryland. And by I, of course, I mean mostly Claude.
The trick is that New York City is a single, though massive, jurisdiction of its own with sub-jurisdictions. Maryland is kind of like that - it has counties and Baltimore City - but those jurisdictions don’t answer to a single authority when it comes to local geography. The number of city-wide GIS layers in NYC is pretty large: in addition to zip codes, the city has neighborhoods, community districts and many of the very local geographies that make Healy’s package so useful.
But Maryland has enough worth building on, and that’s what I’ve done with - shockingly enough - mdmaps. Like nycmaps, it’s an R package, although currently you can only install it directly from GitHub. Here’s what it offers:
Political jurisdictions, including counties, congressional and state legislative districts.
Census geographies, including ZCTAs, PUMAs, blocks and tracts.
State Highway Administration engineering districts.
Why do this at all? First, it’ll help with teaching, since our core data journalism class usually has students making maps of Maryland, and while I need them to understand how geometry-aware dataframes work in R, there’s not a lot of magic behind generating a county-level map. It should be easy and consistent. Second, mdmaps uses the Maryland projection for local accuracy. That can be important when you’re making detailed maps of certain areas.
There are plenty of ways to make maps these days, and I’m not going to be able to improve on the JavaScript-based mapping platforms that power a lot of the really beautiful maps you see online. But I can make it easier for Maryland students to analyze and visualize data about the state and its localities. Be on the lookout for more local additions.