Last week, I deployed my first live Django app. Time from start to finish: three years.
Cue the sound of snickers and a thousand eye-rolls. Go ahead. But I confess: From the moment I said, “I want to build something using Django” to the moment I restarted Apache on my WebFaction server and watched the site load for real in my browser, 36 months passed through the hourglass of time.
You see, I got diverted along the way. I’ll tell you why. But first, two things:
1. Learning is wonderful, thrilling, maddening and rewarding. If you’re a journalist and want to see new worlds, let me encourage you to take a journey into code.
2. The site is right here and the code is here. It falls way short in the Awesome Dept., and it will not save journalism. But that’s not why I built it, really.
* * *
The tale began in March 2009 in Indianapolis at the Investigative Reporters and Editors Computer-Assisted Reporting conference. That’s the annual data journalism hoedown that draws investigative journalists, app coders and academics for a couple of days of nerdish talk about finding and telling stories with data.
Let’s say you want to generate a few hundred — or even a thousand — flat JSON files from a SQL database. Maybe you want to power an interactive graphic but have neither the time nor the desire to spin up a server to dynamically generate the data. Or you think a server adds one more piece of unnecessary complexity and administrative headache. So, you want flat files, each one small for quick loading. And a lot of them.
A few lines of Python are all you need.
I’ve gone this route lately for a few data-driven interactives at USA TODAY, creating JSON files out of large data sets living in SQL Server. Python works well for this, with its JSON encoder/decoder offering a flexible set of tools for converting Python objects to JSON.
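To make the "lots of small flat files" idea concrete, here is a minimal sketch that writes one JSON file per record. It assumes Python 3; the `records` list, the `write_flat_files` helper and the `ID`-based file names are made up for illustration and are not taken from any USA TODAY project.

```python
import json
import os
import tempfile

# Hypothetical records standing in for rows pulled from a database.
records = [
    {"ID": 1, "FirstName": "Ada", "City": "Springfield"},
    {"ID": 2, "FirstName": "Ben", "City": "Riverton"},
]


def write_flat_files(records, out_dir, key="ID"):
    """Write each record to its own small JSON file, named after its key."""
    paths = []
    for record in records:
        path = os.path.join(out_dir, "{}.json".format(record[key]))
        with open(path, "w") as f:
            # Compact separators keep each file as small as possible.
            json.dump(record, f, separators=(",", ":"))
        paths.append(path)
    return paths


# Write into a temporary directory so the sketch leaves no clutter behind.
out_dir = tempfile.mkdtemp()
paths = write_flat_files(records, out_dir)
```

An interactive graphic can then request only the file it needs by ID, so each download stays small.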
Here’s a brief tutorial:
1. If you haven’t already, install Python. Here’s my guide to setup on Windows 7; if you’re on Linux or Mac you should have it already.
2. In your Python script, import a database connector. This example uses pyodbc, which supports connections to SQL Server, MySQL, Microsoft Access and other databases. If you’re using PostgreSQL, try psycopg2.
3. Create a table or tables to query in your SQL database and write and test your query. In this example, I have a table called Students that has a few fields for each student. The query is simple:
SELECT ID, FirstName, LastName, Street, City, ST, Zip FROM Students
4. Here’s an example script that generates two JSON files from that query. One file contains JSON row arrays, and the other JSON key-value objects. Below, we’ll walk through it step-by-step.
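A script along those lines might look like the sketch below. It is an illustration under assumptions, not the original script: it assumes Python 3, and it swaps in the standard library's `sqlite3` module with an in-memory Students table and made-up rows so that it runs without a database server. With pyodbc, you would replace `make_demo_connection()` with `pyodbc.connect(...)` and your own connection string. The helper names and output file names are hypothetical.

```python
import json
import os
import sqlite3
import tempfile

QUERY = "SELECT ID, FirstName, LastName, Street, City, ST, Zip FROM Students"


def make_demo_connection():
    """Build an in-memory stand-in for the Students table (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Students (ID INTEGER, FirstName TEXT, LastName TEXT, "
        "Street TEXT, City TEXT, ST TEXT, Zip TEXT)"
    )
    conn.executemany(
        "INSERT INTO Students VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "Ada", "Lane", "1 Main St", "Springfield", "VA", "22150"),
            (2, "Ben", "Ortiz", "9 Oak Ave", "Riverton", "UT", "84065"),
        ],
    )
    return conn


def fetch_students(conn):
    """Run the query; return the column names and the rows as plain lists."""
    cursor = conn.cursor()
    cursor.execute(QUERY)
    # cursor.description is part of the Python DB-API; item 0 is the name.
    columns = [col[0] for col in cursor.description]
    # Convert each row to a list so the json module can serialize it.
    rows = [list(row) for row in cursor.fetchall()]
    return columns, rows


def write_json_files(columns, rows, out_dir):
    """Write one file of row arrays and one file of key-value objects."""
    arrays_path = os.path.join(out_dir, "student_rowarrays.json")
    objects_path = os.path.join(out_dir, "student_objects.json")
    with open(arrays_path, "w") as f:
        json.dump(rows, f)
    # Pair each value with its column name to build key-value objects.
    objects = [dict(zip(columns, row)) for row in rows]
    with open(objects_path, "w") as f:
        json.dump(objects, f)
    return arrays_path, objects_path


conn = make_demo_connection()
columns, rows = fetch_students(conn)
out_dir = tempfile.mkdtemp()
arrays_path, objects_path = write_json_files(columns, rows, out_dir)
conn.close()
```

Row arrays make for smaller files because the field names are not repeated in every record, while key-value objects are easier to read and to work with in JavaScript.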
Briefly, some recaps from my week at the 2012 National Institute for Computer-Assisted Reporting conference, held in late February in St. Louis:
The basics: 2012 marked my 10th NICAR conference, an annual gathering of journalists who work with data and, increasingly, with code to find and tell stories. It’s sponsored by Investigative Reporters and Editors, a nonprofit devoted to improving investigative journalism. Panels ranged from data transparency to regular expressions.
Catch up: The best way to review what you learned (or find out what you missed) is to read Chrys Wu’s excellent collection of presentation links and IRE’s conference blog.
Busy times: Our USA TODAY data journalism team served on a half-dozen panels and demos. With Ron Nixon of The New York Times and Ben Welsh of the Los Angeles Times, I led “Making Sure You Tell a Story,” a reminder to elevate our reporting, graphics and news apps. (Here are the slides from me and Ben.) I also joined Christopher Groskopf for a demo of his super-utility csvkit, which I’ve written about. And, finally, I spoke about USA TODAY’s public APIs and how building them helps newsrooms push content anywhere.
Award!: Our team was excited to pick up the second-place prize in the 2011 Philip Meyer Awards for the Testing the System series by Jack Gillum, Jodi Upton, Marisol Bello and Greg Toppo. Truly an honor.
Surprise Award!: At the Friday evening reception, I received an IRE Service Award for contributing 2010 Census data to IRE, first for sharing with members on deadline and eventually for use in IRE’s census.ire.org site. Colleague and master of all things Census Paul Overberg also was honored, along with the NYT’s Aron Pilhofer, the Chicago Tribune’s Brian Boyer and others. Out of the blue and humbling.
On the Radar: I ran into O’Reilly Radar’s Alex Howard at the conference — the side conversations are always a bonus of these things — and he later emailed me some questions about data journalism. My responses ended up in two pieces he wrote: “In the age of big data, data journalism has profound importance for society” and “Profile of the data journalist: the storyteller and the teacher.”