Test Drive: Freebase Gridworks 1.1

Data journalists spend lots of time wrestling dirty data, so when I heard the News Applications team at the Chicago Tribune raving about the data-handling abilities of Freebase Gridworks, my interest was piqued. Anything that can lessen the pain of cleaning data is worth a closer look!

Freebase Gridworks is a Java-based app that runs locally in your web browser. The makers’ pitch describes it best:

… A power tool that allows you to load data, understand it, clean it up, reconcile it internally, augment it with data coming from Freebase, and optionally contribute your data to Freebase for others to use. All in the comfort and privacy of your own computer.

Installation is simple. I chose to load Gridworks on my Windows XP-based work laptop, although you can download Mac and Linux versions from the code page. I was up and running in about five minutes, which included loading a new version of Java.

You can open an existing project or create a new one by importing a data file — and Gridworks hints at its utility by providing options to parse delimited or non-delimited files, limit the import to specific rows, etc. For testing, I grabbed the Academic Libraries: 2008 Public Use Data file from the National Center for Education Statistics — a tab-delimited text file of about 4,100 rows.
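If you want to peek at a delimited file like this before loading it into any tool, here is a minimal Python sketch of the same idea: pick a delimiter and cap the number of rows read. This is not how Gridworks works internally. The function name `preview_delimited` and the sample rows are made up for illustration and do not come from the NCES file.

```python
import csv
import io
from itertools import islice


def preview_delimited(source, delimiter="\t", limit=5):
    """Return the header row and up to `limit` data rows from a delimited text stream."""
    reader = csv.reader(source, delimiter=delimiter)
    header = next(reader, [])          # first line is treated as the header
    rows = list(islice(reader, limit))  # stop after `limit` data rows
    return header, rows


# Hypothetical sample data standing in for a real tab-delimited file.
sample = io.StringIO(
    "school\tvolumes\n"
    "Example College\t1200\n"
    "Sample University\t3400\n"
    "Demo Institute\t560\n"
)

header, rows = preview_delimited(sample, limit=2)
print(header)
for row in rows:
    print(row)
```

To try it on a real file, pass an open file handle instead of the `io.StringIO` sample, for example `open("yourfile.txt", newline="")`, where the file name is a placeholder.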

Minkoff, Data Delvers and Yours Truly

Michelle Minkoff, perhaps the hardest-working journalism student I’ve ever encountered, has spent the last few months writing up a series of interviews with hacker-journalists and newsroom data nerds at her web site. Her subjects include designers, coders and data lovers of all stripes. Among them are Pulitzer winner Matt Waite of PolitiFact fame, my Gannett colleagues Gregory Korte and Matt Wynn, and the St. Paul Pioneer Press’s Mary Jo Webster, whom I worked with for several years at USA TODAY.

Now add me to the list. Michelle interviewed me right after one of this winter’s East Coast blizzards, and my cabin fever shows in the sheer verbosity of my responses. But it was fun reliving my early days, when I discovered the power of merging data and reporting. Here’s one quote:

A reporter in the newsroom came to me and said, “Hey, it would be really good if we could figure out what the most valuable properties are in the city of Poughkeepsie.” And I thought to myself, “You know, this might be a good opportunity for me to go and make friends with the IT guy over in City Hall.” I went over and visited him, he was down in the basement of City Hall, in the computer room. Back in those days, they all had big mainframe computers in an air-conditioned room.

Actually, what I first did was I went to the tax assessor’s office, and I said, “I want a list of all the properties in the city of Poughkeepsie and how much they’ve been assessed for.” And they pointed me over to the corner where there were these big books filled with computer printouts, and they said, “Well, all the numbers are there, and you can just start copying them down.” And I thought to myself, “If they were printed on this piece of paper that looks like computer paper, then certainly they are in a computer somewhere in this building. And I can get that data on a disk that I can bring over and put into my computer.” And that’s how I really started figuring out that we can do computer-assisted reporting by going to the government and getting data.

That’s what I did. I went to visit that guy in City Hall, and I said, “Look, I know you’ve got a file on your computer. I’d love to have you put it on this floppy disk for me.” And he had to check with the local attorneys, and get their permission, and I called up a sunshine advocate in New York state and got him to weigh in, and they agreed that, “Yeah, the law says we can do this.” The next thing I know, I had that data on the computer and was going through it in Paradox. We wound up writing a couple of stories about different properties.

A hat tip to Michelle for a smart way to gain insight into our slice of journalism.

The danger of thinking like it’s 1985

For a devout music fan weaned on what’s now called classic rock, the ’80s were miserable. Sure, we had U2 — they alone helped ease the pain of hair metal and synthpop. But from an audiophile’s perspective, for someone who thinks sound is as important as structure, the era made for painful listening.

Why? Because most music recorded in the ’80s — for all its supposed ambition and technical innovation — sounds more dated, more processed and more fake today than the music of the ’60s and ’70s, including disco. Line up Abbey Road or Dark Side of the Moon next to anything by Duran Duran or Human League and the point is made.

What hurt ’80s music most was the rush to digital sounds. Musicians grabbed every gizmo they could find — synthesizers, drum machines, vocal effects, digital guitar processors — and abandoned their lovely analog gear. When Phil Collins’ engineer figured out how to use a noise gate to make his drums sound as big as a 747, everyone copied. Songs now revolved not around good lyrics or melodies but the sounds of these machines. It all had a big wow factor, but it lacked one important quality:

None of it was timeless.

Oh, people thought it was. That’s what it feels like in the midst of every movement. “This will last forever.” Well …


Anthony

About me

I'm a journalist who works with words, code and data. I'm also a husband, father, musician, gardener and occasional poet. I love finding and telling great stories. I'm inspired by art, music and design that elevate. I pursue the truth. Data journalism's the focus here, but other topics will crop up. Thanks for reading.
data journalists

Brian Boyer
Aaron Bycoffe
Jack Gillum
Gregory Korte
Aron Pilhofer
Mark Schaver
Matt Waite
Ben Welsh
Derek Willis
Matt Wynn
