<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" ><channel><title>Anthony DeBarros &#187; News technology</title> <atom:link href="http://www.anthonydebarros.com/category/news-technology/feed/" rel="self" type="application/rss+xml" /><link>http://www.anthonydebarros.com</link> <description>DATA. JOURNALISM. LIFE.</description> <lastBuildDate>Tue, 17 Jan 2012 14:16:00 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>Setting up Python in Windows 7</title><link>http://www.anthonydebarros.com/2011/10/15/setting-up-python-in-windows-7/</link> <comments>http://www.anthonydebarros.com/2011/10/15/setting-up-python-in-windows-7/#comments</comments> <pubDate>Sat, 15 Oct 2011 16:36:40 +0000</pubDate> <dc:creator>Anthony</dc:creator> <category><![CDATA[News technology]]></category> <category><![CDATA[Programming]]></category> <category><![CDATA[Python]]></category><guid isPermaLink="false">http://www.anthonydebarros.com/?p=1551</guid> <description><![CDATA[An all-wise journalist once told me that &#8220;everything is easier in Linux,&#8221; and after working with it for a few years I&#8217;d have to agree &#8212; especially when it comes to software setup for data journalism. But &#8230; Many newsroom types spend the day in Windows without the option of Ubuntu or another Linux OS. 
[...]]]></description> <content:encoded><![CDATA[<p>An all-wise journalist once told me that &#8220;everything is easier in Linux,&#8221; and after working with it for a few years I&#8217;d have to agree &#8212; especially when it comes to software setup for data journalism. But &#8230;</p><p>Many newsroom types spend the day in Windows without the option of Ubuntu or another Linux OS. I&#8217;m planning some Python training soon, so I compiled this quick setup guide as a reference. I hope you find it helpful.</p><p><strong>Set up Python</strong></p><p>Get started:</p><p>1. Visit the official <a href="http://python.org/download/" target="_blank">Python download page</a> and grab the Windows installer. Choose the 32-bit or 64-bit version, depending on your version of Windows 7 (right-click the Computer icon on your desktop and select Properties to find out which one you have). <strong>Note:</strong> Python currently exists in two versions, the older 2.x series and newer 3.x series (for a discussion of the differences, see <a href="http://wiki.python.org/moin/Python2orPython3" target="_blank">this</a>). This tutorial focuses on the 2.x series.</p><p>2. Run the installer and accept all the default settings, including the &#8220;C:\Python27&#8221; directory it creates.</p><p><span id="more-1551"></span><br /> 3. Next, set the system&#8217;s PATH variable to include the directories that hold Python components and the packages we&#8217;ll add later. To do this:</p><ul><li>Right-click Computer and select Properties.</li><li>In the dialog box, select Advanced System Settings.</li><li>In the next dialog, select Environment Variables.</li><li>In the User Variables section, edit the PATH statement to include this:</li></ul><p>&nbsp;</p><div class="wp_syntax"><div class="code"><pre class="dos" style="font-family:monospace;">C:\Python27;C:\Python27\Lib\site-packages\;C:\Python27\Scripts\;</pre></div></div><p>4. 
Now, you can open a command prompt (Start Menu|Accessories or Start Menu|Run|cmd) and type:</p><p>&nbsp;</p><div class="wp_syntax"><div class="code"><pre class="dos" style="font-family:monospace;">C:\<span style="color: #33cc33;">&gt;</span> python</pre></div></div><p>That will load the Python interpreter:</p><p>&nbsp;</p><div class="wp_syntax"><div class="code"><pre class="dos" style="font-family:monospace;">Python 2.7.2  <span style="color: #33cc33;">(</span>default, Jun 12 2011, 14:24<span style="color: #33cc33;">)</span> [MSC v.1500 64 bit <span style="color: #33cc33;">(</span>AMD64<span style="color: #33cc33;">)</span>] on win32
Type &quot;help&quot;, &quot;copyright&quot;, &quot;credits&quot; or &quot;license&quot; <span style="color: #00b100; font-weight: bold;">for</span> more information.
<span style="color: #33cc33;">&gt;&gt;&gt;</span></pre></div></div><p>Because of the settings you included in your PATH variable, you can now run this interpreter &#8212; and, more important, a script &#8212; from any directory on your system.</p><p>Press Control-Z to exit the interpreter and get back to a C: prompt.</p><p><strong>Set up useful Python packages</strong></p><p>1. <a href="http://pypi.python.org/pypi/setuptools" target="_blank">setuptools</a> offers the helpful <a href="http://peak.telecommunity.com/DevCenter/EasyInstall" target="_blank">easy_install</a> utility for installing Python packages. Grab the appropriate version for your system and install.</p><p>2. <a href="http://pypi.python.org/pypi/pip" target="_blank">pip</a> is another package installer that improves on setuptools. Having pip and setuptools will cover most of your installation needs, so go ahead and add pip. Now that you&#8217;ve installed setuptools, you can add pip by typing this at any command prompt:</p><p>&nbsp;</p><div class="wp_syntax"><div class="code"><pre class="dos" style="font-family:monospace;">easy_install pip</pre></div></div><p>Notice that easy_install executes without needing to be told where on the system it&#8217;s located. That&#8217;s the benefit of adjusting your PATH variable earlier.</p><p>3. <a href="http://wwwsearch.sourceforge.net/mechanize/" target="_blank">Mechanize</a> and <a href="http://www.crummy.com/software/BeautifulSoup/" target="_blank">BeautifulSoup</a> are must-have utilities for web scraping, and we&#8217;ll add those next:</p><p>&nbsp;</p><div class="wp_syntax"><div class="code"><pre class="dos" style="font-family:monospace;">pip install mechanize
pip install BeautifulSoup==3.2</pre></div></div><p>4. <a href="http://csvkit.readthedocs.org/en/latest/index.html" target="_blank">csvkit</a>, which I recently <a href="http://www.anthonydebarros.com/2011/09/11/csvkit-data-files/" target="_blank">covered here</a>, is a great tool for dealing with comma-delimited text files. Add it:</p><p>&nbsp;</p><div class="wp_syntax"><div class="code"><pre class="dos" style="font-family:monospace;">pip install csvkit</pre></div></div><p>You&#8217;re now set to get started using and learning Python under Windows 7. If you&#8217;re looking for a handy guide, start with the <a href="http://docs.python.org/tutorial/" target="_blank">Official Python tutorial</a>.</p> ]]></content:encoded> <wfw:commentRss>http://www.anthonydebarros.com/2011/10/15/setting-up-python-in-windows-7/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> <item><title>csvkit: A Swiss Army Knife for Comma-Delimited Files</title><link>http://www.anthonydebarros.com/2011/09/11/csvkit-data-files/</link> <comments>http://www.anthonydebarros.com/2011/09/11/csvkit-data-files/#comments</comments> <pubDate>Sun, 11 Sep 2011 20:48:17 +0000</pubDate> <dc:creator>Anthony</dc:creator> <category><![CDATA[News technology]]></category> <category><![CDATA[Programming]]></category> <category><![CDATA[Python]]></category> <category><![CDATA[Tools]]></category><guid isPermaLink="false">http://www.anthonydebarros.com/?p=1500</guid> <description><![CDATA[If you&#8217;ve ever stared into the abyss of a big, uncooperative comma-delimited text file, it won&#8217;t take long to appreciate the value and potential of csvkit. 
csvkit is a Python-based Swiss Army knife of utilities for dealing with, as its documentation says, &#8220;the king of tabular file formats.&#8221; It lets you examine, fix, slice, transform [...]]]></description> <content:encoded><![CDATA[<p>If you&#8217;ve ever stared into the abyss of a big, uncooperative comma-delimited text file, it won&#8217;t take long to appreciate the value and potential of <a href="http://csvkit.readthedocs.org/en/latest/" target="_blank">csvkit</a>.</p><p>csvkit is a Python-based Swiss Army knife of utilities for dealing with, as its documentation says, &#8220;the king of tabular file formats.&#8221; It lets you examine, fix, slice, transform and otherwise master text-based data files (and not only the comma-delimited variety, as its name implies, but tab-delimited and fixed-width as well). <a href="https://twitter.com/#!/onyxfish" target="_blank">Christopher Groskopf</a>, lead developer on the Knight News Challenge-winning <a href="https://docs.google.com/present/view?id=dft4sbfd_71fgd4fpg3&amp;pli=1" target="_blank">Panda project</a> and recently a member of the <em>Chicago Tribune&#8217;s</em> <a href="http://blog.apps.chicagotribune.com/author/cgroskopf/" target="_blank">news apps team</a>, is the primary coder and architect, but the code&#8217;s <a href="https://github.com/onyxfish/csvkit">hosted on Github</a> and has a growing list of contributors.</p><p>As of version 0.3.0, csvkit comprises 11 utilities. <a href="http://csvkit.readthedocs.org/en/latest/#usage">The documentation</a> describes them well, so rather than rehash it, here are highlights of three of the utilities I found interesting during a recent test drive:<br /> <span id="more-1500"></span><br /> <strong>csvcut:</strong> Henceforth, this utility will likely meet every csv file I get. 
To start, it will describe the file contents for me: If I want a quick scan of the column names and their order in the file, I just type:<br /> &nbsp;</p><div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">csvcut -n filename</pre></div></div><p>The output is an indexed list of column names, assuming the first row of the file is a header row. If not, you get the first row of data, which can be handy as well.</p><p>Still, the &#8220;cut&#8221; part of this utility is its killer feature, extracting columns from the file in the order you choose. You might use this to subset the file before importing to a database or to quickly reorder columns before embarking on analysis.</p><p>To extract the seventh, first and second columns from the file, in that order, it&#8217;s as simple as:<br /> &nbsp;</p><div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">csvcut -c <span style="color: #ff4500;">7</span>,<span style="color: #ff4500;">1</span>,<span style="color: #ff4500;">2</span> filename</pre></div></div><p>Really nice.</p><p><strong>csvsql:</strong> Send in a csv, and it returns a CREATE TABLE statement for your SQL database. The first time I ran this and saw the result, I did a double-take of pure joy. Then I got slightly depressed thinking about times I wrote code to import 256-column csv files into SQL Server. No more. You just type a statement like this:<br /> &nbsp;</p><div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">csvsql -i postgresql filename</pre></div></div><p>That produces a CREATE TABLE statement with syntax appropriate to PostgreSQL. 
Plenty of SQL flavors are available too &#8212; the utility uses <a href="http://www.sqlalchemy.org/" target="_blank">SQLAlchemy&#8217;s</a> dialect collection to offer syntax options for SQL Server, MySQL, Oracle and others.</p><p>Another killer feature: You can add an <code>"inserts"</code> argument to have csvsql generate a SQL INSERT statement for each row of the CSV. Having been flummoxed by SQL Server&#8217;s import wizard more than once, I can tell you that inserting data by row is a great alternative, especially if you&#8217;re trying to isolate a problem row.</p><p><strong>csvstat: </strong>Returns basic descriptive statistics for each column in the file. Results include overall row count, the data type for each column, and descriptives including min, max, sum, median, most frequent values, etc. Very handy for a quick read on what you have in the file.</p><p>Those three jumped out at me, but there are more. Other utilities will convert files to csv, output a csv as JSON, or merge, clean and stack files. The fact you can pipe output from one utility to another creates a powerful scenario.</p><p>It&#8217;s great work and an example of the kinds of <a href="http://blog.thescoop.org/archives/2011/08/10/in-defense-of-building-tools/" target="_blank">tools journalists can build</a> to deal with common problems we face. 
I&#8217;ll be watching this develop with great anticipation.</p> ]]></content:encoded> <wfw:commentRss>http://www.anthonydebarros.com/2011/09/11/csvkit-data-files/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Free Software and APIs: NICAR 2011 slides</title><link>http://www.anthonydebarros.com/2011/02/26/free-stuff-and-apis-nicar-2011-slides/</link> <comments>http://www.anthonydebarros.com/2011/02/26/free-stuff-and-apis-nicar-2011-slides/#comments</comments> <pubDate>Sat, 26 Feb 2011 18:42:45 +0000</pubDate> <dc:creator>Anthony</dc:creator> <category><![CDATA[Journalism]]></category> <category><![CDATA[News technology]]></category><guid isPermaLink="false">http://www.anthonydebarros.com/?p=1218</guid> <description><![CDATA[I had the privilege this week of speaking on two panels at the 2011 Investigative Reporters and Editors Computer-Assisted Reporting* conference in Raleigh, N.C. Here are the slides my co-presenters and I put together: &#8211; &#8220;Free Software: From Spreadsheets to GIS&#8221; with Jacob Fenton of the Investigative Reporting Workshop. Here is part 1, and here&#8217;s [...]]]></description> <content:encoded><![CDATA[<p>I had the privilege this week of speaking on two panels at the 2011 Investigative Reporters and Editors Computer-Assisted Reporting* conference in Raleigh, N.C. Here are the slides my co-presenters and I put together:</p><p>&#8211; &#8220;Free Software: From Spreadsheets to GIS&#8221; with Jacob Fenton of the <em>Investigative Reporting Workshop</em>. Here is <a href="http://bit.ly/eVYmlc" target="_blank">part 1</a>, and here&#8217;s <a href="http://bit.ly/gZBiUf" target="_blank">part 2</a>.</p><p>&#8211; <a href="http://bit.ly/gFjnyK " target="_blank">&#8220;APIs: Making the Web a Data Medium&#8221;</a> with Derek Willis of <em>The New York Times. 
</em></p><p><em>* Those of us with a few miles on the tires remember that the conference used to go by the name NICAR &#8212; for National Institute for Computer-Assisted Reporting. People still call it that.<br /> </em></p> ]]></content:encoded> <wfw:commentRss>http://www.anthonydebarros.com/2011/02/26/free-stuff-and-apis-nicar-2011-slides/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Test Drive: Freebase Gridworks 1.1</title><link>http://www.anthonydebarros.com/2010/06/06/freebase-gridworks-1-1/</link> <comments>http://www.anthonydebarros.com/2010/06/06/freebase-gridworks-1-1/#comments</comments> <pubDate>Sun, 06 Jun 2010 20:44:26 +0000</pubDate> <dc:creator>Anthony</dc:creator> <category><![CDATA[News technology]]></category> <category><![CDATA[Tools]]></category><guid isPermaLink="false">http://www.anthonydebarros.com/?p=660</guid> <description><![CDATA[Update, 11/10/2010: Since I originally reviewed Freebase Gridworks, it has been acquired by Google. It&#8217;s now called Google Refine, and version 2.0 has been released. Original post follows: &#8212;&#8212;&#8211; Data journalists spend lots of time wrestling dirty data, so when I heard the News Applications team at the Chicago Tribune raving about the data-handling abilities [...]]]></description> <content:encoded><![CDATA[<p><strong>Update, 11/10/2010:</strong> Since I originally reviewed Freebase Gridworks, it has been acquired by Google. It&#8217;s now called <a href="http://code.google.com/p/google-refine/" target="_blank">Google Refine</a>, and <a href="http://google-opensource.blogspot.com/2010/11/announcing-google-refine-20-power-tool.html" target="_blank">version 2.0</a> has been released. 
Original post follows:</p><p>&#8212;&#8212;&#8211;</p><p><strong>Data journalists</strong> spend lots of time wrestling dirty data, so when I heard the News Applications team at the <em>Chicago Tribune</em> <a href="http://blog.apps.chicagotribune.com/2010/05/17/the-gift-of-freebase-gridworks/" target="_blank">raving</a> about the data-handling abilities of <a href="http://code.google.com/p/freebase-gridworks/" target="_blank">Freebase Gridworks</a>, my interest was piqued. Anything that can lessen the pain of cleaning data is worth a closer look!</p><p>Freebase Gridworks is a Java-based app that runs locally in your web browser. The makers&#8217; pitch describes it best:</p><blockquote><p><strong> </strong>&#8230; A power tool that allows you to load data, understand it, clean it up, reconcile it internally, augment it with data coming from <a href="http://www.freebase.com/">Freebase</a>, and optionally contribute your data to Freebase for others to use. All in the comfort and privacy of your own computer.</p></blockquote><p>Installation is simple. I chose to load Gridworks on my Windows XP-based work laptop, although you can download Mac and Linux versions from the <a href="http://code.google.com/p/freebase-gridworks/wiki/Downloads?tm=2" target="_blank">code page</a>. I was up and running in about five minutes, which included loading a new version of Java. Once running, the opening screen looks like so (click for larger version):</p><p><a href="http://www.anthonydebarros.com/wp-content/uploads/2010/06/openscreen.jpg"><img class="alignnone size-medium wp-image-661" style="border: 0pt none;" title="Open Screen" src="http://www.anthonydebarros.com/wp-content/uploads/2010/06/openscreen-300x177.jpg" alt="" width="300" height="177" /></a></p><p>You can open an existing project or create a new one by importing a data file &#8212; and Gridworks hints at its utility by providing options to parse delimited or non-delimited files, limit the import to specific rows, etc. 
For testing, I grabbed the <a href="http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010310" target="_blank">Academic Libraries: 2008 Public Use Data file</a> from the National Center for Education Statistics &#8212; a tab-delimited text file of about 4,100 rows.<br /> <span id="more-660"></span></p><p>Import was a cinch. Gridworks guessed correctly at the file format and split the columns perfectly:</p><p><a href="http://www.anthonydebarros.com/wp-content/uploads/2010/06/initload.jpg"><img class="alignnone size-medium wp-image-664" style="border: 0pt none;" title="Initial load" src="http://www.anthonydebarros.com/wp-content/uploads/2010/06/initload-300x176.jpg" alt="" width="300" height="176" /></a></p><p>The first thing I tried was data cleanup. Some of the cities in the &#8220;CITY_M&#8221; field were in uppercase and some were capitalized normally. Each column header has a menu of manipulation options, so I chose Edit Cells &gt; Common Transforms &gt; To Titlecase:</p><p><a href="http://www.anthonydebarros.com/wp-content/uploads/2010/06/citytrans.jpg"><img class="alignnone size-medium wp-image-670" style="border: 0pt none;" title="City Transform" src="http://www.anthonydebarros.com/wp-content/uploads/2010/06/citytrans-300x177.jpg" alt="" width="300" height="177" /></a></p><p>Gridworks chugged along for a few seconds (a progress bar might be handy), but soon enough it returned all the cities in the correct case. Nice!</p><p>Next, the ZIP_M field (and ZIP also) had a mix of five-digit zips and some with the &#8220;plus 4&#8221; extension. To separate the plus 4&#8217;s into their own field, I chose Edit Column &gt; Split Into Several Columns. 
It produced this dialog:</p><p><a href="http://www.anthonydebarros.com/wp-content/uploads/2010/06/splitzip.jpg"><img class="alignnone size-medium wp-image-668" title="Split Zip" src="http://www.anthonydebarros.com/wp-content/uploads/2010/06/splitzip-300x176.jpg" alt="" width="300" height="176" /></a></p><p>I opted to split the column by field length and typed the values &#8220;5,4&#8221; for the string lengths. To preserve the leading zeros in the zips and extensions, I unchecked the box &#8220;guess cell type&#8221; to keep the fields as text. Gridworks chugged along again, then produced the result, automatically renaming the fields in the process:</p><p><a href="http://www.anthonydebarros.com/wp-content/uploads/2010/06/splitzip2.jpg"><img class="alignnone size-medium wp-image-669" style="border: 0pt none;" title="Split Zip 2" src="http://www.anthonydebarros.com/wp-content/uploads/2010/06/splitzip2-300x178.jpg" alt="" width="300" height="178" /></a></p><p>Another handy feature of Gridworks is its ability to edit field values en masse. If you hover your mouse over a cell, an &#8220;edit&#8221; button appears:</p><p><a href="http://www.anthonydebarros.com/wp-content/uploads/2010/06/editcell.jpg"><img class="alignnone size-medium wp-image-675" style="border: 0pt none;" title="Edit Cell" src="http://www.anthonydebarros.com/wp-content/uploads/2010/06/editcell-300x176.jpg" alt="" width="300" height="176" /></a></p><p>Clicking it brings up a dialog box where you can change the cell&#8217;s value &#8212; and apply that change to all other cells with the same content. Handy! 
Here&#8217;s how you could change all the state names of &#8220;AL&#8221; to &#8220;Alabama&#8221;:</p><p><a href="http://www.anthonydebarros.com/wp-content/uploads/2010/06/editcell2.jpg"><img class="alignnone size-medium wp-image-676" style="border: 0pt none;" title="Edit Cell 2" src="http://www.anthonydebarros.com/wp-content/uploads/2010/06/editcell2-300x177.jpg" alt="" width="300" height="177" /></a></p><p>Data cleanup is clearly a strength, but Gridworks also offers plenty of ways to explore data by creating <a href="http://code.google.com/p/freebase-gridworks/wiki/Faceting" target="_blank">facets</a>, or summaries of data (think using COUNT and GROUP BY in SQL). It produces summary tables that let you quickly find all the unique values in a column &#8212; and edit them if you need to create consistency (for example, company names spelled several ways).</p><p>Finally, Gridworks lets you export your revised data back to Excel or tab/comma-delimited text files, among other options. Very, very useful.</p><p>Judging by its <a href="http://code.google.com/p/freebase-gridworks/wiki/WhatsNew" target="_blank">revision history</a>, Freebase Gridworks is very much an evolving tool but one worth keeping tabs on. 
This little test drive has probably just scratched the surface of the ways you can use it to standardize your data, but you can get more ideas via the demo videos on the product&#8217;s <a href="http://code.google.com/p/freebase-gridworks/" target="_blank">home page</a>.</p> ]]></content:encoded> <wfw:commentRss>http://www.anthonydebarros.com/2010/06/06/freebase-gridworks-1-1/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>Minkoff, Data Delvers and Yours Truly</title><link>http://www.anthonydebarros.com/2010/03/08/minkoff-data-delvers-yours-truly/</link> <comments>http://www.anthonydebarros.com/2010/03/08/minkoff-data-delvers-yours-truly/#comments</comments> <pubDate>Tue, 09 Mar 2010 04:11:46 +0000</pubDate> <dc:creator>Anthony</dc:creator> <category><![CDATA[News technology]]></category> <category><![CDATA[Workflow]]></category><guid isPermaLink="false">http://www.anthonydebarros.com/?p=419</guid> <description><![CDATA[Michelle Minkoff, perhaps the hardest-working journalism student I&#8217;ve ever encountered, for the last few months has been writing up a series of interviews with hacker-journalists and newsroom data nerds at her web site. Her subjects include designers, coders and data lovers of all stripes. Among them are Pulitzer winner Matt Waite of PolitiFact fame, [...]]]></description> <content:encoded><![CDATA[<p>Michelle Minkoff, perhaps the hardest-working journalism student I&#8217;ve ever encountered, for the last few months has been writing up a <a href="http://michelleminkoff.com/category/data-delvers/" target="_blank">series of interviews</a> with hacker-journalists and newsroom data nerds at her <a href="http://michelleminkoff.com/" target="_blank">web site</a>. Her subjects include designers, coders and data lovers of all stripes. 
Among them are Pulitzer winner <a href="http://www.mattwaite.com/" target="_blank">Matt Waite</a> of <a href="http://www.politifact.com/" target="_blank">PolitiFact</a> fame, my Gannett colleagues <a href="http://gregorykorte.com/" target="_blank">Gregory Korte</a> and <a href="http://www.tubotu.com/" target="_blank">Matt Wynn</a>, and the St. Paul Pioneer Press&#8217;s <a href="http://michelleminkoff.com/2010/02/20/data-delver-maryjo-webster-pioneer-press/" target="_blank">Mary Jo Webster</a>, whom I worked with for several years at USA TODAY.</p><p><a href="http://michelleminkoff.com/2010/03/08/data-delver-tony-debarros-usa-today/" target="_blank">Now add me to the list</a>. Michelle interviewed me right after one of this winter&#8217;s east coast blizzards, and my cabin fever shows in the sheer verbosity of my responses. But it was fun reliving my early days &#8212; when I discovered the power of merging data and reporting. Here&#8217;s one quote:</p><blockquote><p>A reporter in the newsroom came to me and said, “Hey, it would be  really good if we could figure out what the most valuable properties are  in the city of Poughkeepsie.”  And I thought to myself, “You know, this  might be a good opportunity for me to go and make friends with the IT  guy over in City Hall.”  I went over and visited him, he was down in the  basement of City Hall, in the computer room.  
Back in those days, they  all had big mainframe computers in an air-conditioned room.</p><p>Actually,  what I first did was I went to the tax assessor’s office, and I said, “I  want a list of all the properties in the city of Poughkeepsie and how  much they’ve been assessed for.”  And they pointed me over to the  corner where there were these big books filled with computer printouts,  and they said, “Well, all the numbers are there, and you can just start  copying them down.”  And I thought to myself, “If they were printed on  this piece of paper that looks like computer paper, then certainly they  are in a computer somewhere in this building.  And I can get that data  on a disk that I can bring over and put into my computer.” And that’s  how I really started figuring out that we can do computer-assisted  reporting by going to the government and getting data.</p><p>That’s what I did.  I went to visit that guy in City Hall, and I  said, “Look, I know you’ve got a file on your computer.  I’d love to  have you put it on this floppy disk for me.”  And he had to check with  the local attorneys, and get their permission, and I called up a  sunshine advocate in New York state and got him to weigh in, and they  agreed that, “Yeah, the law says we can do this.”  The next thing I  know, I had that data on the computer and was going through it in  Paradox.  
We wound up writing a couple of stories about different  properties.</p></blockquote><p>A hat tip to Michelle for a smart way to gain insight into our slice of journalism.</p> ]]></content:encoded> <wfw:commentRss>http://www.anthonydebarros.com/2010/03/08/minkoff-data-delvers-yours-truly/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>The danger of thinking like it&#8217;s 1985</title><link>http://www.anthonydebarros.com/2009/11/23/thinking-like-its-1985/</link> <comments>http://www.anthonydebarros.com/2009/11/23/thinking-like-its-1985/#comments</comments> <pubDate>Mon, 23 Nov 2009 19:49:10 +0000</pubDate> <dc:creator>Anthony</dc:creator> <category><![CDATA[Journalism]]></category> <category><![CDATA[News technology]]></category><guid isPermaLink="false">http://www.anthonydebarros.com/?p=209</guid> <description><![CDATA[For a devout music fan weaned on what&#8217;s now called classic rock, the &#8217;80s were miserable. Sure, we had U2 &#8212; they alone helped ease the pain of hair metal and synthpop. But from an audiophile&#8217;s perspective, for someone who thinks sound is as important as structure, the era made for painful listening. Why? Because [...]]]></description> <content:encoded><![CDATA[<p><strong>For a devout music fan</strong> weaned on what&#8217;s now called classic rock, the &#8217;80s were miserable. Sure, we had U2 &#8212; they alone helped ease the pain of hair metal and synthpop. But from an audiophile&#8217;s perspective, for someone who thinks sound is as important as structure, the era made for painful listening.</p><p>Why? Because most music recorded in the &#8217;80s &#8212; for all its supposed ambition and technical innovation &#8212; sounds more dated, more processed and more fake today than the music of the &#8217;60s and &#8217;70s, including disco. 
Line up <em>Abbey Road</em> or <em>Dark Side of the Moon</em> next to anything by <a href="http://www.youtube.com/watch?v=B5m24ST7rSw" target="_blank">Duran Duran</a> or <a href="http://www.youtube.com/watch?v=9EHpozHn-QA" target="_blank">Human League</a> and the point is made.</p><p>What hurt &#8217;80s music most was the rush to digital sounds. Musicians grabbed every gizmo they could find &#8212; synthesizers, drum machines, vocal effects, digital guitar processors &#8212; and abandoned their lovely analog gear. When <a href="http://en.wikipedia.org/wiki/Hugh_Padgham#The_.22gated_drum.22_sound" target="_blank">Phil Collins&#8217; engineer figured out how to use a noise gate to make his drums sound as big as a 747</a>, everyone copied. Songs now revolved not around good lyrics or melodies but the sounds of these machines. It all had a big wow factor, but it lacked one important quality:</p><p>None of it was <em>timeless</em>.</p><p>Oh, people thought it was. That&#8217;s what it feels like in the midst of every movement. &#8220;This will last forever.&#8221; Well &#8230;</p><p><span id="more-209"></span>In 1991, the music of the 1980s officially died. That&#8217;s when Nirvana released <em>Nevermind </em>and Pearl Jam exploded with <em>Ten</em>, both featuring a sound that was entirely a return to all that the &#8217;80s had abandoned &#8212; authentic instruments without a lot of gimmicks. Video may have killed the radio star, but it couldn&#8217;t kill what was timeless.</p><p>As a journalist who loves technology, I wonder whether we&#8217;ll look back in 20 years and have a similar take on this first decade of the 2000s. The Twittersphere is filled daily with reports of new apps, new sites, new data visualizations. People Tweet every other sentence from conferences where digital gurus explain where the news business is heading, maybe. 
Having gorged on print profits far too long, we news types are running towards all things digital hoping for a cure for our indigestion.</p><p>A lot of it is interesting, some certainly carries the wow-factor, and some of it is going to be <a href="http://www.documentcloud.org/" target="_blank">truly useful</a>. But how much of it will last? How much is timeless? How can we even tell?</p><p>In the &#8217;80s, pop music became all about the technology and very little about the song. In journalism, we don&#8217;t have songs; we have stories.</p><p>In music, a  great song is timeless. In journalism, a great story is.</p><p>In music, the song transcends the instrument &#8212; it sounds great on guitar or piano or both. In journalism, the story transcends the medium &#8212; you can tell it with photo, graphic, app, text or all.</p><p>But an instrument without a song is nothing. So is a medium without a story.</p><p>I love apps. I love data. I love visualizations. But unless these toys of ours deliver a great story &#8212; one that moves me like the best, most authentic music &#8212; they&#8217;ll have all the lasting impact of <a href="http://www.youtube.com/watch?v=JrBoOd7JQtk" target="_blank">Wang Chung</a>.</p> ]]></content:encoded> <wfw:commentRss>http://www.anthonydebarros.com/2009/11/23/thinking-like-its-1985/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> </channel> </rss>
