Data Journalism and the Big Picture
The web-o-sphere this week brought forth a collection of opinions on the value of data journalism and the skills that go with it. To wit:
- Tim Berners-Lee, he who invented the World Wide Web, told the Guardian that “journalists need to be data-savvy” and that “data-driven journalism is the future.” The story then goes on to question whether data analysis could ever replace traditional reporting.
- The blog 10,000 Words declared that one of the “5 Myths about digital journalism” is that “journalists must have database development skills” and suggested that most journalists should leave high-level hacking to the experts.
- Another site, FleetStreetBlues, opined that “amidst all this hype, earnestness and spreadsheet-geekery, here’s the truth about so-called ‘data journalism’. It’s still about the story, stupid.”
There’s been a bunch of reaction to these posts, including a few people pointing out a 1986 Time story that sounds similar to the one this week from the Guardian. And therein lies the problem with all three pieces: None of them benefits from a big-picture, historical perspective on data journalism — not where it came from, not how it’s changed and especially not the massive amount of ground the label covers these days.
We used to call it CAR
Back when software came on 5.25-inch floppy disks, or maybe before then, the idea of using a PC to “crunch numbers” was christened “computer-assisted reporting.” These days, we call it data journalism because, along the way, it became obvious the old name was anachronistic. As Phil Meyer once said, we don’t talk about telephone-assisted reporting, do we?
When I got into the game — when Paradox was the desktop database manager of choice — our newsroom had a personal computer designated as the “CAR station.” While others worked on dumb terminals connected to a mainframe, I was surfing the web with Netscape and ringing up Paul Overberg for advice on Census data. I was the newsroom data expert — the guy reporters called when they had a spreadsheet on a disk or an idea to get data from city hall.
In that era, with database-driven web startups like Amazon.com spreading a cultural revolution, it was easy to foresee a time when reporters wouldn’t just get the occasional spreadsheet but would find themselves inundated with data. Thus was born (at least in my sphere) the drive to evangelize CAR in the newsroom. We taught Excel, we sent people to IRE boot camps, we set up presentations showing the kinds of stories journalists were landing with these skills. The message of CAR was about finding stories and using simple tools to do it: spreadsheets, databases, maps, stats.
Now we call it hacking
Soon enough, though, the craft began to change and so did the talk at IRE CAR conferences — especially in the hands-on classes and demos. In Philadelphia in 2002, the hands-on classes mostly covered Access, Excel, SPSS and, for the adventurous, SQL Server. Just a few years later, in Cleveland and Houston, the offerings included sessions on web scraping, Perl, Python, MySQL and Django.