The Use and Abuse of Bar Graphs

Ken Schultz, a political scientist at Stanford, was inspired by the misleading Wall Street Journal graphic and the now-vanished Tax Foundation blog post to illustrate just how easy it is to manipulate bar graphs by changing the bin boundaries:

I thought it would be an interesting exercise to see how easily someone without scruples could twist the same data to support whatever argument they wanted to make about the distribution of taxable income (and, by implication, the proper targets for taxation). The attached file presents four graphs using the same data to depict the income distributions four different ways. This way, people can pick their preferred tax policy and then select the graph that supports their pick. No need for data to constrain your policy prescriptions!

And here are the graphics Schultz made using the same IRS data as the Journal (see Kevin Drum for a similar approach):

[Graph: Schultz1]

[Graph: Schultz2]

[Graph: Schultz3]

[Graph: Schultz4]
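To make the binning trick concrete, here is a minimal Python sketch. The income totals are invented for illustration and are not the IRS figures, and the helper names `rebin` and `peak` are hypothetical. It merges the same fine-grained bins three different ways and reports which bar ends up tallest each time.

```python
# Hypothetical taxable income totals ($ billions) by income range ($ thousands).
# These numbers are invented for illustration; they are NOT the IRS data.
# An upper bound of None means the bin is open-ended at the top.
fine_bins = [
    ((0, 25), 300),
    ((25, 50), 600),
    ((50, 75), 700),
    ((75, 100), 650),
    ((100, 200), 1400),
    ((200, 500), 700),
    ((500, 1000), 300),
    ((1000, None), 700),
]


def rebin(bins, upper_edges):
    """Merge adjacent bins so each merged bin ends at one of upper_edges.

    upper_edges must be drawn from the upper bounds of the fine bins and
    should end with None so the open-ended top bin is included.
    """
    merged = []
    lower = bins[0][0][0]
    running = 0
    edges = iter(upper_edges)
    target = next(edges)
    for (lo, hi), total in bins:
        running += total
        if hi == target:
            merged.append(((lower, hi), running))
            lower, running = hi, 0
            target = next(edges, None)
    return merged


def peak(bins):
    """Return the (lower, upper) range of the bin with the largest total."""
    return max(bins, key=lambda b: b[1])[0]


# Same data, three binning choices, three different "tallest bars".
for edges in ([50, 200, None], [100, None], [100, 200, 500, 1000, None]):
    coarse = rebin(fine_bins, edges)
    print(edges, "-> tallest bar:", peak(coarse))
```

With these made-up totals, the tallest bar lands on $50,000-$200,000, on "over $100,000," or on "under $100,000" depending only on where the cut points fall. The underlying totals never change.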

In short, it’s possible to draw almost any conclusion you want from the data if you mess with the bin sizes enough. That doesn’t mean that all bar graphs are equally valid, however. Some readers of my previous post have argued that the Journal’s original graphic wasn’t misleading, but that claim doesn’t hold up once the graphic is read in the context of the editorial. Here’s the relevant passage:

The rich, in short, aren’t nearly rich enough to finance Mr. Obama’s entitlement state ambitions—even before his health-care plan kicks in.

So who else is there to tax? Well, in 2008, there was about $5.65 trillion in total taxable income from all individual taxpayers, and most of that came from middle income earners. The nearby chart shows the distribution, and the big hump in the center is where Democrats are inevitably headed for the same reason that Willie Sutton robbed banks.

[Graph: WSJ chart]

There are many problems with the editorial’s logic, but the relevant one here is the idea that the graph proves that most taxable income comes from “middle income earners.” That’s empirically false if you define “middle income” to mean the middle of the income distribution. The peak of the “hump in the center” of the Journal’s own graphic is for people who make $100,000-$200,000, but as the Journal notes, the top 10% (including joint filers) make $114,000 and above. That’s not the middle unless you stretch out the distribution (as the Journal did) by including numerous bins for the very small number of people making over $200,000. In reality, the top 20% earned 50% of all money income in 2009 (PDF; see Table 3), with the top 5% taking home 22%. The middle quintile — the true “middle income earners” — made a whopping 15%.*

In fairness to the Journal, I should note (as Schultz points out via email) that the IRS data the Journal used (Excel spreadsheet) are grouped into the same income ranges included in its bar chart. The Journal didn’t change the bin sizes to fit its preferred conclusion, but it did plot the data in a way that misrepresented the shape of the US income distribution across the population.
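One way to see the difference between a bin's position on the chart and its position in the population is to convert each bin into the percentile range of filers it covers. The sketch below does that in Python. The return counts are invented for illustration and are not the IRS figures, and `percentile_ranges` is a hypothetical helper name.

```python
# Hypothetical counts of tax returns (in hundreds of thousands) per income range.
# Invented for illustration; these are NOT the IRS figures.
returns_by_bin = [
    ("under $50,000", 600),
    ("$50,000-$100,000", 250),
    ("$100,000-$200,000", 110),
    ("$200,000-$500,000", 30),
    ("$500,000-$1,000,000", 7),
    ("over $1,000,000", 3),
]


def percentile_ranges(bins):
    """Return (label, start_pct, end_pct) for each bin, ordered by income."""
    total = sum(count for _, count in bins)
    ranges = []
    cumulative = 0
    for label, count in bins:
        start = 100 * cumulative / total
        cumulative += count
        end = 100 * cumulative / total
        ranges.append((label, start, end))
    return ranges


for label, start, end in percentile_ranges(returns_by_bin):
    print(f"{label}: percentiles {start:.1f} to {end:.1f}")
```

With these made-up counts, the $100,000-$200,000 bar is the third of six and sits near the center of the chart, yet the filers in it fall between the 85th and 96th percentiles. Drawing every bin as an equal-width bar hides that gap.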

* The Census data are money income, not total taxable income, but the conclusion holds.

[Cross-posted at Brendan-Nyhan.com]

Brendan Nyhan

Brendan Nyhan is an assistant professor of government at Dartmouth College.