This is a cross-post of a recent item I wrote for Investigative Reporters and Editors’ On the Road blog. “Ghost Factories” was perhaps the most fun, interesting and well-executed project I’ve done at USA TODAY, largely because the people and process worked so well. This covers all the moving parts:
* * *
In April, after USA TODAY published its Ghost Factories investigation into forgotten lead smelters, we heard from several people who wanted to know more about how the project came together — particularly the online package that included details on more than 230 of the former factories.
The following is an expanded version of a post originally sent to IRE’s NICAR-L mailing list:
Alison Young was the lead reporter who conceived the idea for the project. In late 2010, she came to me with a couple of PDFs showing a list of suspected lead smelter sites, which I parsed into a spreadsheet and plotted on a Google map for her to research. Then she started digging, as one of our editors said, “Armed only with faded photographs, tattered phone directories, obscure zoning records, archival maps, fuzzy memories of residents and shockingly incomplete EPA studies.”
In December 2010, she began filing the first of more than 140 FOIA requests. The requests produced thousands of pages of government documents related to the sites, and to catalog them she created a project inside DocumentCloud. DocumentCloud proved extremely helpful both for organizing the documents and for presenting them. Brad Heath of our investigative team would later use the DocumentCloud API to pull metadata from the documents, particularly their titles, into our database so we could present them online. He also used the API to batch-publish all 372 documents that were included in the project. (He did most of the work using python-documentcloud, a Python wrapper by the Los Angeles Times’ Ben Welsh that makes it easy to interact with the API programmatically.)
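For readers curious what that kind of batch work can look like, here is a minimal sketch. It is not Brad's actual code. The function names, the project title argument and the credentials are hypothetical, and the python-documentcloud calls (`client.projects.get`, `project.document_list`, `doc.put`) reflect the wrapper as I understand it, so check them against the version you have installed.

```python
def load_project_documents(username, password, project_title):
    """Fetch the documents in one DocumentCloud project (hypothetical helper).

    Assumes python-documentcloud is installed; the import is kept local so the
    other helpers can be used and tested without the library or a network.
    """
    from documentcloud import DocumentCloud

    client = DocumentCloud(username, password)
    project = client.projects.get(title=project_title)
    return project.document_list


def collect_titles(documents):
    """Map each document's id to its title, ready to load into a database."""
    return {doc.id: doc.title for doc in documents}


def publish_all(documents):
    """Make every non-public document public and save the change.

    Returns the ids of the documents that were actually changed.
    """
    changed = []
    for doc in documents:
        if doc.access != "public":
            doc.access = "public"
            doc.put()  # assumed save call; some versions also expose save()
            changed.append(doc.id)
    return changed
```

In practice you would call `load_project_documents(...)` once, pass the result to `collect_titles` to build the rows for your own database, and run `publish_all` only when the package is ready to go live.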