Much has been written over the past week about the DiGrazia et al. paper showing a relationship between a candidate’s tweet share and vote share and the Fabio Rojas op/ed in the Washington Post plugging it. I don’t want to get into a critique of the paper’s methods or findings, as many others have ably done. But I do want to use this as a teachable moment.
In recent years, many academics (including yours truly) have turned to blogging, Twitter, Facebook, and op/ed writing as a way to publicize our academic research. This is generally a good thing. We shouldn’t just be talking to ourselves. Communicating our findings to a broader audience and exposing ourselves to questions and critiques from non-academic readers help improve the relevance and quality of our work and increase the chances that policymakers and practitioners will actually use our knowledge.
It is, however, possible to make mistakes when publicizing our work that may be as damaging as mistakes in the work itself. This recent episode points to three such mistakes:
Misrepresent your findings
The basic finding of the paper, that tweet shares provide some explanatory power in elections that other factors (incumbency, district partisanship, etc.) don’t, is interesting. But Rojas goes beyond that in the op/ed, saying, “In the 2010 data, our Twitter data predicted the winner in 404 out of 435 competitive races.” Except it turns out that the Twitter data didn’t predict that on their own; that figure came from a multivariate regression analysis. As Blumenthal and Edwards-Levy report, the Twitter data alone would have called only 72% of the 2010 House races correctly. They would have gotten 111 races wrong! You could have done far, far better just knowing who the incumbent was or how Obama did in that district two years earlier.
Oversell your conclusions
The “404 out of 435 competitive races” quote above is actually a clarification. The original op/ed said that they’d correctly called 404 out of 406 competitive races. That would be pretty impressive, except it wasn’t true. Oh, and you really can’t characterize all 435 House races as “competitive,” unless your definition of competitive is “contains at least one candidate.” Which is sort of like defining sex as “at least one naked person in a room.”
Make enemies you don’t need to make
The Rojas op/ed argues:
“This new world will undermine the polling industry. For nearly a century, conventional wisdom has argued that we can only truly know what the public thinks about an issue if we survey a random sample of adults. An entire industry is built on this view. Nearly every serious political campaign in the United States spends thousands, even millions, of dollars hiring campaign consultants who conduct these polls and interpret the results. Digital democracy will put these campaign professionals out of work.”
So they basically declared war on the polling industry. Which means that thousands of political practitioners just went from not caring about a piece of academic research to wanting to destroy it. Now, of course, sometimes it’s good to make a few enemies, especially if you’re right and they’re clearly wrong. But that doesn’t appear to be the case here. Twitter data may, on the margins, improve our forecast of an election, but that doesn’t obviate polling, and it doesn’t remotely undermine the other valuable functions polling serves, such as knowing how various subgroups are behaving and figuring out what people think about various public policy options and why they think that way.
To sum up, the authors had an interesting finding and ended up selling it badly in print and on social media. I have no idea if this hurts the paper’s chances of publication in the long run. The authors are essentially arguing that, for political candidates, any publicity is good publicity. We’ll see if that holds for academic papers.
[Cross-posted at Mischiefs of Faction]