Exploring Relationships with the Census

The folks at the Knight/Mozilla OpenNews Source blog recently asked me to write about a Census topic of my choosing, and I chose to focus on a lesser-traveled piece of Census data: relationships.

The post, Understanding Households and Relationships in Census Data, walks through the definitions the Census Bureau uses for householders and relatives, how it asks the questions and tabulates the results, and some of the key tables that report the data. Thanks to the OpenNews team for letting me dust off my Census know-how!

Enter the Rift: Taking journalism to VR

As I write, my voice is hoarse from three days showing Harvest of Change — a Des Moines Register/Gannett Digital series that used the Oculus Rift and 360-degree video — to hundreds of journalists at the Online News Association conference in Chicago.

The demos capped a two-week sprint that included a media day in New York City, publishing five versions of the software and then catching some media buzz, which alternately praised and scoffed at the effort. Such whirlwinds are fleeting, but highlights are milestones. So, while it’s fresh, here’s a recap.

First, a scene from the Midway at ONA:

That’s Rosental Alves, director of the Knight Center for Journalism in the Americas at the University of Texas at Austin, trying out the project. We set up three Oculus workstations, and for three days the chairs were rarely empty. On the last day, as we packed up, we figured between 400 and 500 people had tried it.

Most people came out of curiosity, or with skepticism, but left impressed. Some were compelled by Amy Webb, who said in a Saturday ONA session that our experience was a must-see. Apparently, we even made the unofficial ONA bingo card.

The story behind this story

The project came together over the summer. When I wasn’t coding backend data for an election forecast, I was heading a small team visiting the dusty back roads of Iowa, both in person and in the Oculus headset. Lots has been written about the Oculus Rift, especially since Facebook acquired it for $2 billion, but the focus so far has been on gaming. After journalism innovation professor Dan Pacheco of Syracuse University introduced us to the Rift, Gannett Digital decided to build its first VR explanatory journalism project.

App launch: 2014 elections forecast

With more than 1,300 candidates, 507 races, top-line campaign finance data and poll averages for select races, the 2014 midterm elections forecast app we launched in early September is probably the most complex mash-up of data, APIs and home-grown content built yet by our Interactive Applications team at Gannett Digital.

We’re happy with the results — even more because the app is live not only at USA TODAY’s desktop and mobile websites but across Gannett. With the rollout of a company-wide web framework this year, we’re able to publish simultaneously to sites ranging from the Indianapolis Star to my alma mater, The Poughkeepsie Journal.

What’s in the forecast? Every U.S. House and Senate race plus the 36 gubernatorial races up in November with bios, photos, total receipts and current poll averages. For each race, USA TODAY’s politics team weighed in on a forecast for how it will likely swing in November. Check out the Iowa Senate for an example of a race detail page.

Finally, depending on whether you open the app with a desktop, tablet or phone, you’ll get a version specifically designed for that device. Mobile-first was our guiding principle.

Building the backend

This was a complex project with heavy lifts both on design/development and data/backend coding. As usual, I handled the data/server side for our team with assists from Sarah Frostenson.

As source data, I used three APIs plus home-grown content:

— The Project Vote Smart API supplies all the candidate names, party affiliations and professional, educational and political experience. Most of the photos are via Vote Smart, though we supplemented where missing.

— The Sunlight Foundation’s Realtime Influence Explorer API supplies total receipts for House and Senate candidates via the Federal Election Commission.

— From Real Clear Politics, we’re fetching polling averages and projections for the House (USAT’s politics team is providing governor and Senate projections).

The route from APIs to the JSON files that fuel the Backbone.js-powered app goes something like this:

  1. Python scrapers fetch data into Postgres, running on an Amazon EC2 Linux box.
  2. A basic Django app lets the USAT politics team write race summaries, projections and other text. Postgres is the DB here also.
  3. Python scripts query Postgres and spit out the JSON files, combining all the data for various views.
  4. We upload those files to a cached file server, so we’re never dynamically hitting a database.
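For the curious, step 3 can be sketched roughly like this. This is a minimal illustration, not our production code: it uses Python's built-in sqlite3 as a stand-in for Postgres so it runs anywhere, and the table names, column names, race IDs and candidate rows are all hypothetical placeholders.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def make_demo_db():
    """Create an in-memory database with placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")  # stand-in for the Postgres database
    conn.executescript(
        """
        CREATE TABLE races (id TEXT PRIMARY KEY, name TEXT, projection TEXT);
        CREATE TABLE candidates (
            race_id TEXT REFERENCES races(id),
            name TEXT, party TEXT, receipts REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO races VALUES (?, ?, ?)",
        [("demo-senate", "Demo Senate race", "Toss-up"),
         ("demo-house-01", "Demo House District 1", "Leans Party X")],
    )
    conn.executemany(
        "INSERT INTO candidates VALUES (?, ?, ?, ?)",
        [("demo-senate", "Candidate B", "Party Y", 250000.0),
         ("demo-senate", "Candidate A", "Party X", 100000.0),
         ("demo-house-01", "Candidate C", "Party X", 50000.0)],
    )
    conn.commit()
    return conn


def build_race_json(conn, out_dir):
    """Write one JSON file per race, combining race text with its candidates."""
    conn.row_factory = sqlite3.Row  # lets us read columns by name
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    races = conn.execute(
        "SELECT id, name, projection FROM races ORDER BY id"
    ).fetchall()
    for race in races:
        candidates = conn.execute(
            "SELECT name, party, receipts FROM candidates "
            "WHERE race_id = ? ORDER BY name",
            (race["id"],),
        ).fetchall()
        payload = {
            "race": race["name"],
            "projection": race["projection"],
            "candidates": [dict(c) for c in candidates],
        }
        path = out_dir / f"{race['id']}.json"
        path.write_text(json.dumps(payload, indent=2))
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in build_race_json(make_demo_db(), tmp):
            print(path.name)
```

The point of the pattern is that the app only ever reads flat files: all the joins happen once, at build time, and the output directory is what gets uploaded in step 4.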

Meanwhile, at the front

Front-end work was a mix of data-viz and app framework lifting. For the maps and balance-of-power bars, Maureen Linke (now at AP) and Amanda Kirby used D3.js. Getting data viz to work well across mobile and desktop is a chore, and Amanda in particular spent a chunk of time getting the polling and campaign finance bar charts to flex well across platforms.

For the app itself, Jon Dang and Rob Berthold — working from a design by Kristin DeRamus — used Backbone.js for URL routing and views. Rob also wrote a custom search tool to let readers quickly find candidates. Everything then was loaded into a basic template in our company CMS.

This one featured a lot of moving parts, and anyone who’s done elections knows there always are the edge cases that make life interesting. In the end, though, I’m proud of what we pulled off — and really happy to serve readers valuable info to help them decide at the polls in November.

Updates from the Lands of Life & Work

Apologies for the lengthy radio silence. It’s been a busy and complicated couple of months — so busy that I never did write the 2013 year-end wrap I’d planned. Life and work served up some changes from the predictable, and writing fell off the table. A dose of reality.

In the past, each of these nuggets might have been a post of its own, but to get caught up, here’s a mix of work and life highlights in the old USA TODAY Newsline format:

Mass killings interactive: After a year-long effort, last December we published an immersive data viz called Behind the Bloodshed: The Untold Story of America’s Mass Killings. Inspired by the events surrounding the Newtown, Conn., school shooting, it lays out the facts about mass killings over the last 8+ years: They happen often and are most often the result of family issues. My team at Gannett Digital collaborated with USA TODAY’s database team, and a post I wrote for Knight-Mozilla OpenNews’ Source blog explains our tech and process. We and our readers were super-happy with the results. We won the journalistic innovation category of the National Headliner Awards and made the short-lists for the Data Journalism Awards and the Online Media Awards.

NICAR 2014: The annual IRE data journalism conference, held in Baltimore this year, was great. About 1,000 attendees made for the largest turnout ever, and a “Getting Started With Python” session I taught was packed (here’s the Github repo). Highlights always include catching up with friends and colleagues, and as usual I focused on sessions with practical takeaways, such as learning more about d3.js and Twitter bots. Chrys Wu, as always, rounds up everything at her site. Next, I’m hoping to catch the IRE conference in San Francisco in June.

Relaunching our platform: For the last four months at work, I’ve taken a detour away from interactives to help our team that’s extending our publishing platform across all our community news and TV station properties. In short, versions of the complete makeover USA TODAY got in 2012 are now appearing on sites ranging from the Indianapolis Star to Denver’s KUSA-TV. It’s more than cosmetic, though, as Gannett Digital’s also moving all the sites to a shared CMS and Django- and Backbone-powered UX. In addition to desktop, there’s all-new mobile web, Android, iPad and iPhone apps. It’s been tiring but rewarding. In the process of personally launching the Wilmington News Journal, Springfield News-Leader, Montgomery Advertiser and several other sites, I’ve gained a better view of the breadth of Gannett’s journalism and found some great opportunities for collaboration.

Other cool work things: While I was relaunching websites, the rest of our interactives team collaborated with USA TODAY’s Brad Heath on his project exploring how felons can escape justice by crossing state lines. I’ve started refactoring the scraper behind our tropical storm tracker to get it ready for the upcoming season. We’ve been bringing Mapbox training to our newsrooms, which has given me the chance to finally dig deeper into TileMill and the Mapbox API. And you might have heard we have some big elections coming up in November. Finally, I recently tried both Google Glass and the Oculus Rift. Check back in five years on whether they’ve changed/saved journalism, but overall the experience reminded me of how I felt when I began using a web browser. Clunky but filled with potential.

Family life: The biggest event of the last while was another detour. At the end of 2013, my wife was hit by a devastating illness that required a lengthy hospital stay and convalescence. In the interests of privacy, I’m not going to post details. But it’s true that these events are life-changers — sitting in the hospital ICU with a first-row seat to a life-and-death drama changes perspectives and priorities quickly. I am sure the lessons we learned from and about people and life will play a big role in how the rest of 2014 plays out for us. (Oh, and before you ask: she’s doing better now.)

Goals for the rest of the year: Between illness and detours, it feels like the year is just getting started. I hope to post more often with Python, data and tech tips. I’ve bought Two Scoops of Django and JavaScript: The Definitive Guide for light summer reading, and I continue to plug away on a writing project that I hope to finish soon. And that’s in addition to lots of family and fun stuff we have in sight.

Thanks for hanging in, and please stay in touch!

Ghost Factories: Behind the Project

This is a cross-post of a recent item I wrote for Investigative Reporters and Editors’ On the Road blog. “Ghost Factories” was perhaps the most fun, interesting and well-executed project I’ve done at USA TODAY, largely because the people and process worked so well. This covers all the moving parts:

*  *  *

In April, after USA TODAY published its Ghost Factories investigation into forgotten lead smelters, we heard from several people who wanted to know more about how the project came together — particularly the online package that included details on more than 230 of the former factories.

The following is an expanded version of a post originally sent to IRE’s NICAR-L mailing list:

Alison Young was the lead reporter who conceived the idea for the project. In late 2010, she came to me with a couple of PDFs showing a list of suspected lead smelter sites, which I parsed into a spreadsheet and plotted on a Google map for her to research. Then she started digging, as one of our editors said, “Armed only with faded photographs, tattered phone directories, obscure zoning records, archival maps, fuzzy memories of residents and shockingly incomplete EPA studies.”

In December 2010, she began filing the first of more than 140 FOIA requests. The requests produced thousands of pages of government documents related to the sites, and to catalog them she created a project inside DocumentCloud. The product was extremely helpful both for organizing documents and for presentation. Brad Heath of our investigative team would later use the DocumentCloud API to integrate metadata from the documents — particularly their titles —  into our database so we could present them online. He also used the API to batch-publish all 372 documents that were included in the project. (He did most of the work using python-documentcloud, a Python wrapper by the Los Angeles Times’ Ben Welsh that makes it easy to interact with the API programmatically.)