MARGIN OF ERROR….Taegan Goddard sez:
According to a new Pew Research poll, Sen. Barack Obama’s national lead over Sen. John McCain has disappeared. The race is now a statistical tie, with Obama barely edging McCain, 46% to 43%.
This comes via Nick Beaudrot, who claims not only that this is wrong, but that you can go ask Kevin Drum if you don’t believe him. And it’s true. I don’t think the “statistical tie” trope is ever going to go away, but that still doesn’t make it right.
I originally wrote about this back in 2004, but here it is again. The idea of a “statistical tie” is based on the theory that (a) statistical results are credible only if they are at least 95% certain to be accurate, and (b) any lead less than the MOE is less than 95% certain.
There are two problems with this: first, 95% is not some kind of magic cutoff point, and second, the idea that the MOE represents 95% certainty about the lead is wrong anyway. A poll’s MOE does represent a 95% confidence interval for each candidate’s individual percentage, but it doesn’t represent a 95% confidence interval for the difference between the two, and that’s what we’re really interested in.
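To see why the lead carries a wider margin than either candidate's number, here's a rough sketch in Python. It assumes a simple random sample and uses the textbook variance for the difference between two shares measured in the same poll. The function names and the sample size of 1,000 are made up for illustration.

```python
import math

Z95 = 1.96  # z-score for a 95% confidence level

def share_moe(n, p=0.5):
    """95% MOE, in points, for one candidate's share (worst case p = 0.5)."""
    return 100 * Z95 * math.sqrt(p * (1 - p) / n)

def lead_moe(p1, p2, n):
    """95% MOE, in points, for the lead p1 - p2 measured in the same poll."""
    # Variance of the difference of two shares from one multinomial sample.
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return 100 * Z95 * math.sqrt(variance)

n = 1000  # hypothetical sample size, for illustration only
print(f"MOE for one share: {share_moe(n):.1f} points")
print(f"MOE for the lead:  {lead_moe(0.46, 0.43, n):.1f} points")
```

With both candidates in the mid-40s, the margin on the lead comes out at nearly twice the headline MOE, which is why the headline figure can't be applied directly to the gap between the two.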
In fact, what we’re really interested in is the probability that the difference is greater than zero — in other words, that one candidate is genuinely ahead of the other. But this probability isn’t a cutoff, it’s a continuum: the bigger the lead, the more likely that someone is ahead and that the result isn’t just a polling fluke. So instead of lazily reporting any result within the MOE as a “tie,” which is statistically wrong anyway, it would be more informative to just go ahead and tell us how probable it is that a candidate is really ahead. Here’s a table that gives you the answer to within a point or two:
So in the poll quoted above, how probable is it that Obama is really ahead? Pew contacted 2,414 registered voters, which means the MOE of the poll is about 2%, and they report that Obama’s lead is 3 percentage points. So go to the top row and then read the number from the 3% column. Answer: there’s a 93% probability that Obama is genuinely ahead of McCain (i.e., that his lead in the poll isn’t just due to sampling error).
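If you'd rather compute the number than look it up, here's a minimal sketch. It is one common normal approximation and not necessarily the formula behind the table. It assumes a simple random sample, a two-way race close to 50-50, and that the reported MOE is the 95% margin for a single candidate's share. The function names are made up for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_ahead(lead_pts, moe_pts):
    """Approximate probability that the poll leader is genuinely ahead.

    lead_pts: reported lead, in percentage points
    moe_pts:  reported 95% margin of error, in percentage points
    """
    se_share = moe_pts / 1.96   # standard error of one candidate's share
    se_lead = 2.0 * se_share    # near 50-50, the lead's SE is about double
    return normal_cdf(lead_pts / se_lead)

# Pew example: 3-point lead, MOE of about 2 points
print(round(prob_ahead(3, 2) * 100))  # prints 93
```

The same function returns 50% for a lead of zero and climbs toward 100% as the lead grows, which is the continuum described above.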
Generally speaking, national polls use sample sizes of about 1,100, which translates to an MOE of 3%. State polls often use a sample of 600, which produces an MOE of 4%. Subsets of polls sometimes have MOEs of 5% or higher.
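Those sample-size figures follow from the standard formula for a 95% margin of error. Here's a quick sketch, assuming a simple random sample and the worst case of a 50% share. The function name is made up for illustration, and real pollsters may adjust for weighting and design effects.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error, in percentage points, for a share p with sample size n."""
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)

# Sample sizes mentioned in the post
for n in (2414, 1100, 600):
    print(f"n = {n}: MOE of about {margin_of_error(n):.1f} points")
```

Because the margin shrinks with the square root of the sample size, cutting the MOE in half takes roughly four times as many respondents.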
Now, there are plenty of reasons other than sampling error to take polls with a grain of salt: they’re just snapshots in time, the results are often sensitive to question wording or question ordering, it’s increasingly hard to get representative samples these days, etc. etc. But from a pure statistical standpoint, a lead is a lead and it’s always better to be ahead than behind.
ACKNOWLEDGMENTS: Thanks to Nancy Carter and Neil Schwertman, Professors of Mathematics and Statistics at California State University, Chico, for providing me with the formulas used to generate the table and the spreadsheet.