DLP Dispatch #4
Hello, happy new year, and welcome to the fourth edition of the Data Liberation Project’s newsletter. Inside: Progress on the EPA Risk Management Program data, the DLP’s first grant-supported project, the latest batch of FOIA requests, DLP metadata, and other news from the data-FOIA-sphere.
Progress on the EPA Risk Management Program data
In DLP Dispatch #3, I shared the news that the Environmental Protection Agency had agreed to (largely) fulfill the Data Liberation Project’s FOIA request for the agency’s Risk Management Program database.
More good news: I have received the EPA’s compact disc in the mail and, with the help of the DLP’s earliest volunteers 🙌, have begun documenting and processing the data. It’s not too late to contribute; if you’d like to pitch in, do get in touch.
The DLP’s first grant-supported project
And some more good news: Earlier this week, the DLP was named one of DocumentCloud’s initial Gateway Grantees. The grant will support DLP efforts to create a searchable archive of the Federal Emergency Management Agency’s “Daily Operations Briefing” PDFs, to extract the agency’s direct-housing statistics from those PDFs (e.g., see here), and to turn those figures into a proper dataset.
I’ve made some progress on the early parts of the pipeline:
- Open-source code to convert FEMA’s briefing emails into an RSS feed and CSV file
- A new DocumentCloud “Add-On” for using RSS feeds to upload PDFs
- A public DocumentCloud project powered by the components above
- Some initial tinkering with the PDF-parsing code — enough to convince me that a full parse is feasible
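For readers curious what the first step might look like, here is a minimal sketch (not the DLP's actual code) of turning a list of briefing records into an RSS feed and a CSV file using only Python's standard library. The record fields, the example.com URLs, and the function names are all assumptions made for illustration; the real pipeline would derive its records from FEMA's briefing emails.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical records standing in for what would be parsed out of the emails.
BRIEFINGS = [
    {
        "title": "Daily Operations Briefing (example 1)",
        "url": "https://example.com/briefings/example-1.pdf",
        "published": datetime(2023, 1, 3, 12, 30, tzinfo=timezone.utc),
    },
    {
        "title": "Daily Operations Briefing (example 2)",
        "url": "https://example.com/briefings/example-2.pdf",
        "published": datetime(2023, 1, 4, 12, 30, tzinfo=timezone.utc),
    },
]


def to_rss(briefings, feed_title="Example briefing feed",
           feed_link="https://example.com/briefings/"):
    """Build a small RSS 2.0 document with one <item> per briefing."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Links to briefing PDFs"
    for b in briefings:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = b["title"]
        ET.SubElement(item, "link").text = b["url"]
        ET.SubElement(item, "guid").text = b["url"]
        # RSS expects RFC 822-style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(b["published"])
    return ET.tostring(rss, encoding="unicode")


def to_csv(briefings):
    """Write the same records as CSV text, one row per briefing."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["title", "url", "published"], lineterminator="\n"
    )
    writer.writeheader()
    for b in briefings:
        writer.writerow({**b, "published": b["published"].isoformat()})
    return buf.getvalue()
```

Keeping the feed and the CSV as two views of the same list of records means a downstream tool can poll the feed for new PDFs while the CSV serves as a plain archive of everything seen so far.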
A hearty shoutout to Mira Rojanasakul, whose main graphic in this New York Times article inspired this project.
The latest batch of FOIA requests
The holiday season slowed down my FOIA research and filing, so I have just two new requests to share with you. They seek:
- Data collected by the Federal Mediation and Conciliation Service on the characteristics of work stoppages (strikes and lockouts), inspired by a similar request by Forest Gregg, which FMCS denied for reasons that I believe were weakly supported.
You can read more about each request via the links above. If you have any questions about them, please do ask.
DLP metadata
It felt a bit inconsistent that the DLP provided no structured data about its own proceedings. Here’s a first attempt at fixing that: this repository contains tabular data representing each DLP FOIA request, plus each progress update for each request. It’s the same information you’ll find in webpage form here, minus the request summaries.
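As a rough illustration of how two such tables might be used together, the sketch below pairs each request with its most recent progress update. The column names (`request_id`, `update_date`, `status`) and the sample rows are hypothetical placeholders, not the repository's actual schema or data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; the real files may use different columns.
REQUESTS_CSV = """request_id,agency,filed_date
example-request-a,Example Agency A,2023-01-01
example-request-b,Example Agency B,2023-01-02
"""

UPDATES_CSV = """request_id,update_date,status
example-request-a,2023-01-05,acknowledged
example-request-a,2023-01-20,records received
example-request-b,2023-01-09,acknowledged
"""


def latest_status_by_request(requests_text, updates_text):
    """Map each request ID to the status of its most recent update."""
    updates = defaultdict(list)
    for row in csv.DictReader(io.StringIO(updates_text)):
        updates[row["request_id"]].append(row)

    result = {}
    for req in csv.DictReader(io.StringIO(requests_text)):
        # ISO-formatted dates sort correctly as plain strings.
        history = sorted(updates[req["request_id"]],
                         key=lambda u: u["update_date"])
        result[req["request_id"]] = history[-1]["status"] if history else None
    return result


print(latest_status_by_request(REQUESTS_CSV, UPDATES_CSV))
```

A request with no updates yet maps to `None`, so newly filed requests still appear in the result.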
Elsewhere in the data-FOIA-sphere
- Via the Reporters Committee for Freedom of the Press: “A city in northern Oregon has agreed to disclose records showing how much water Google uses to cool its local data centers, bringing to a close a year-long legal dispute that began when the city sued The Oregonian/OregonLive to shield data about its largest water user.” And here’s The Oregonian, reporting on the data: “Google’s water use in The Dalles has nearly tripled in the past five years, and the company’s data centers now consume more than a quarter of all the water used in the city.”
- In a case related to electronic hospital records, the Kansas Supreme Court found that the state’s Open Records Act does “[require] a public agency, upon request, to provide a copy of a public record in the format in which it maintains that record.” Per the court’s opinion: “The only accurate reproduction of an electronic file is a copy of the electronic file, which can easily be provided by, for example, email or thumb drive.” [h/t Shawn Musgrave + Media Law Resource Center]
That’s all for now! Thank you for reading, and don’t hesitate to reply.
— Jeremy