Sorting Data in Excel: Simple Analysis

Sorting a data set helps answer a basic question journalists like to ask: “Which ____ has the highest (or lowest) ______?”

Excel (and other spreadsheets such as the open source Calc) make sorting data easy. In fact, I often make sorting my first step when “interviewing” data because it quickly reveals high and low values and often highlights some that may seem questionable.

Let’s work through a simple sort in Excel. I’ll be using Excel 2007, but older versions have similar functions. Start by downloading the file “sorting.xls” and saving it to your computer. Open it and follow along:

1. We have a table of Census data from the 2006-2008 American Community Survey. It shows the median age of the population for each of 79 school districts in Virginia plus the state itself.

We want to know which district has the oldest and youngest populations. Let’s sort it!

2. Click once on one cell anywhere in the table. This will help Excel auto-discover your table in the next step.

(more…)

Mean vs. Median: A Beginner’s Guide

A common way to summarize a group of numbers — one most of us learned in grade school — is to find its mean, commonly called the average. But it’s not always the best measure.

Let’s say six kids go on a field trip, ages 10, 11, 10, 9, 13 and 12. It’s easy to add the ages and divide by six to get the group’s average age:
 

(10 + 11 + 10 + 9 + 13 + 12) / 6 = 10.8

Because all the ages are close, the average of 10.8 gives us a good picture of the group as a whole. But averages are less helpful when the values are skewed toward one end or if they include outliers.

For example, what if we add a much older chaperone to our field trip? With ages of 10, 11, 10, 9, 13, 12 and 46, the average age of the group rises considerably:
 

(10 + 11 + 10 + 9 + 13 + 12 + 46) / 7 = 15.9

Now the mean is not an accurate representation. The outlier skews the average, and no journalist should feel comfortable reporting it.

This is where calculating a median is handy. The median is the midpoint in an ordered list of values — the point at which half the values are higher and half lower. If the median household income in East Middletownburg is $50,000, then half the households earn more and half less.

(more…)

Adjusting for inflation: A beginner’s guide

When Daniel Craig hit theaters last year in Quantum of Solace, the 22nd film in the James Bond spy series, his ability to dispatch bad guys (and charming good looks, no doubt) helped it earn $168.4 million. That was enough to rank Solace among the top 10 grossing films of 2008.

But how did Solace fare against the rest of the Bond canon, which stretches back to 1963′s Dr. No? The answer depends on whether you adjust for inflation.

We all know that the price of a loaf of bread isn’t what it used to be. The cost of consumer goods tends to rise each year, except during downturns or various calamities. So, taking inflation (or deflation) into account is the only way to  meaningfully compare dollar amounts over time.

There are plenty of apps just for this. The Bureau of Labor Statistics offers one basic calculator, and there’s another at this site. They’re fine for a quick check, but I’d rather do my own calculations. A web app might not have the latest data. And if you’re adjusting more than a couple of amounts, using a spreadsheet will save time. Here’s an exercise from Bond-land:

(more…)

Percent change: Know the formula

Here’s a question I posed to some college students recently:

Let’s say you cover the Town of East Middleburgtown. The mayor announces that this year’s town budget comes in at $12.6 million. Last year’s budget was $11.4 million. What is the percent change? Better yet, what’s the formula for figuring it out?

If you don’t know the answer, or how to obtain it, you’re not alone. This kind of problem — which is in my son’s 7th grade math textbook — routinely stumps most journalists in most of the newsrooms across America.

I’ll avoid the temptation to moralize here. If you’re a journalist — if you have a pulse — you need to know this very basic operation. With it, you’ll have the power to analyze all kinds of data and even double-check the mayor’s math.

Here it is:
 

(the_new_number - the_original_number) / the_original_number

or, in the case of East Middletownburg:
 

(12.6-11.4) / 11.4

Remember (and you learned this in fifth grade) that operations in parentheses come first. That gives you this:
 

1.2 / 11.4 = .105 = 10.5%

So, the mayor’s new budget is a 10.5% increase over last year’s. Now you have something to write about!

Excel: Combine text and formulas in a cell

Whenever I analyze data in Excel, I format the spreadsheet to make it easier to read. A little attention to fonts, boxes and shading can help people understand the key data faster.

One way to give yourself some flexibility with formatting is to combine text and the results of a formula in a single cell. Just use the “&” operator to concatenate the text and the formula.

Consider this formula:

="Quantity: "&SUM(A1:A20)

Enter it into a cell, press enter and (assuming you have numeric values in cells A1 through A20) it will present this result in a single cell:

Quantity: 23

That kind of output’s pretty handy when you want to create a worksheet in your spreadsheet that aggregates data from other sheets while keeping the formatting simple.