A group called EGAP (Experiments in Governance and Politics) is considering a proposal to establish a “Pilot Registry for Research Designs” where scholars could register new research projects, specifying in advance the topic, data to be collected, hypotheses to be tested, data analysis to be conducted, and conditions under which the hypotheses would be accepted or refuted. Once the research was conducted and written up, journal editors and referees would have access to the corresponding prospectus in order to verify that the results reported in the paper were not instances of “publication bias” or mindless “fishing” for statistically significant results. Upon publication, or after some pre-specified period of time, the corresponding research prospectus would be in the public domain.
The focus of the pilot proposal is on “Prospective Research” designs, whether experimental or observational, “for which outcomes have not yet been realized.” That is mostly not what I do. Nevertheless, a friend (perhaps inspired by the proposers’ interest in expanding the system to include “retrospective studies,” and in using experience with the proposed pilot registry to decide “whether to make registration mandatory for some kinds of research”) asks, as a “thought experiment,” how such a system would affect my work, suggesting as an example my 1996 article on “Uninformed Votes.” My response is below the fold.
I’ve done some papers, including that one, that are sufficiently simple-minded that they would not be much affected by a process of the sort you describe. On the other hand, that paper is also a case in which effective enforcement would seem to be impossible. I have the NES cumulative file on my computer; what’s to stop me from ransacking it night and day for random correlations, then submitting the best 5% (suitably dressed up as “hypotheses to be tested”) to the registry? I suppose I could be made to wait for new data, but in this case that moratorium would still be in effect (since I used six presidential elections, and NES might not last long enough to provide six more).
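To make the "ransacking" worry concrete, here is a minimal simulation sketch. It is not an analysis of NES data: the sample size, the number of candidate variables, the use of pure Gaussian noise, and the function names are all illustrative assumptions, and the significance cutoff uses the large-sample approximation |r| > 1.96/sqrt(n). The point it illustrates is only that a search through variables with no real relationship to the outcome should still turn up "significant" correlations at roughly the nominal 5% rate.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fish_for_correlations(n_respondents=200, n_variables=400, seed=0):
    """Correlate one noise outcome with many unrelated noise variables.

    All sizes are hypothetical; nothing here comes from a real survey.
    Returns (number of 'significant' correlations, number tested).
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_respondents)]
    # Large-sample approximation to the two-sided 5% cutoff for r
    # when the true correlation is zero (an assumption of this sketch).
    threshold = 1.96 / math.sqrt(n_respondents)
    hits = 0
    for _ in range(n_variables):
        noise = [rng.gauss(0, 1) for _ in range(n_respondents)]
        if abs(pearson_r(noise, outcome)) > threshold:
            hits += 1
    return hits, n_variables


hits, total = fish_for_correlations()
print(f"{hits} of {total} pure-noise variables passed the 5% cutoff")
```

A registry sees only the correlations that are submitted as hypotheses, not the number that were tried first, which is why enforcement is hard when the data are already in hand.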
More often, my projects start with questions rather than “hypotheses” and end with findings of varying credibility rather than “accept or refute” decisions. I think of that as a kind of science—indeed, as the most fruitful kind of science we can do given where we are in our understanding of politics. And I think of the credibility of the findings as depending much more on the quality of the data and analysis than on their theoretical provenance or predictability (or, for that matter, their “statistical significance”). I recognize that many other scholars are more rigid in their views (and many others less so), which is fine with me, as long as we don’t have to waste a lot of potentially valuable time debating, or legislating, epistemology. (I assume there would be some journals that would not opt into the “registry” system, and I would stick with those rather than “constrain my implementation of the projects” I work on.)
But you want examples. Since I hosted a workshop on Unequal Democracy a few weeks ago (I am just starting to work on a revised edition), I’ve thought recently about those analyses and can provide a brief, chapter-by-chapter run-down:
1. (Introduction) All pre-existing data in public domain and previously analyzed in similar ways by other scholars. Not sure whether or how this would be covered, since it is “merely” descriptive analysis.
2. The Partisan Political Economy. All pre-existing data in public domain. I set out to look for partisan patterns of income growth, but without any very strong preconceptions about what I would find. Having found them, I tried to make them go away, using a variety of data I didn’t know existed when I started. In the course of attempting to understand where these partisan differences came from, I also discovered some unexpected secondary patterns in the data that seem to me to shed light on that question (honeymoon years vs. non-honeymoon years; first terms following partisan turnover vs. others).
3. Class Politics and Partisan Change. All pre-existing data in public domain. This was an adaptation of previously published work, and shaped in a variety of ways by discussion and criticism of that work (e.g., differing implications of alternative measures of “class”; more elaborate analysis of a variety of potential “wedge” issues in 2004 NES data).
4. Partisan Biases in Economic Accountability. All pre-existing data in public domain. I had worked on “myopia” in economic voting in a separate project with Chris Achen, but did not think to connect it with partisan patterns of income growth until the two papers had sat near each other on my desk for a couple years. I had thought to look at economic voting by income class using NES data, then thought to examine class-specific growth, then thought to examine the effect of high-income growth on other income groups. Your editor and referees would have to decide whether the caveats in the text (“rather remarkably suggest,” “not impossible that the apparent electoral significance of high-income growth is merely a statistical fluke”) and robustness checks in the notes (dropping elections, considering various sub-samples of the data, comparing growth for other groups) were sufficient to make this pattern eligible for reporting.
5. Do Americans Care About Inequality? Mix of old and new data. Entirely descriptive except (perhaps) for interactions between information and ideology (which John Zaller would take as evidence of “polarization,” except that the concept had not previously been applied to “objective” facts).
6. Homer Gets a Tax Cut. Mix of old and new data. The NES module I helped design included some items reflecting my interests and expectations and others suggested by collaborators with their own agendas. The concept of “unenlightened self-interest,” which is the most novel theoretical underpinning of the analyses, was induced from the data rather than derived from anywhere. The analyses of partisanship and information, egalitarian values, and trade-off preferences among taxes, spending, and deficits were all added in response to suggestions from readers of earlier versions.
7. The Strange Appeal of Estate Tax Repeal. Ditto 6, with some historical analysis added subsequent to data collection that made the implications of the chapter in the context of the book quite different from what I had in mind at the start.
8. The Eroding Minimum Wage. All publicly available data (aside from some income breakdowns of public polls provided by Marty Gilens). Largely descriptive. Partisan differences (Table 8.2) anticipated; partisan interactions (Table 8.3) unanticipated (and not “statistically significant”); effects of constituency opinion and partisanship on roll call vote anticipated, but only after it occurred to me that I had overlapping data (from Chapter 9).
9. Economic Inequality and Political Representation. All publicly available data. All of the analyses originally assumed a linear effect of income on political influence; the non-parametric specification with three income groups was suggested by readers of an early draft. The analysis of mechanisms (turnout, knowledge, contact) in Table 9.11 was an unanticipated elaboration; not sure where that came from.
10. (Conclusion) No data. Sections headed “Who Governs?” and “Political Obstacles to Economic Equality” could have been anticipated when I started the project; sections headed “Partisan Politics and the ‘Have-Nots’” and “The City of Utmost Necessity” could not (at least by me).
You should feel free to share any of this, if any of it seems helpful.
All the best,
[Cross-posted at The Monkey Cage]