Chris Blattman


The other kind of Star Wars: The quest for ** and ***

Top academic journals curate the field and naturally select papers with statistically significant results. Anticipating this, do researchers inflate the significance of their findings by reporting only the specifications with the largest test statistics?

A new paper from Brodeur, Lé, Sangnier and Zylberberg:

Using 50,000 tests published between 2005 and 2011 in the AER, JPE, and QJE, we identify a residual in the distribution of tests that cannot be explained by selection. The distribution of p-values exhibits a camel shape with abundant p-values above 0.25, a valley between 0.25 and 0.10 and a bump slightly below 0.05. The missing tests (with p-values between 0.25 and 0.10) can be retrieved just after the 0.05 threshold and represent 10% to 20% of marginally rejected tests. Our interpretation is that researchers might be tempted to inflate the value of those almost-rejected tests by choosing a “significant” specification. We propose a method to measure inflation and decompose it along articles’ and authors’ characteristics.
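The diagnostic the abstract describes is simple enough to sketch. Below is a minimal Python illustration of the idea: convert test statistics into two-sided p-values and compare the mass of p-values in bins around the 0.05 threshold, where an excess just below 0.05 paired with a deficit between 0.10 and 0.25 would suggest inflation. The simulated z-statistics and the bin edges are placeholders of mine, not the authors' data or procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical |z|-statistics (coefficient / standard error), standing in
# for the ~50,000 tests the authors harvest from published tables.
z = np.abs(rng.normal(loc=1.5, scale=1.0, size=50_000))

# Two-sided p-value for a z-test: p = 2 * (1 - Phi(|z|)).
p = 2 * stats.norm.sf(z)

# Bin the p-values around the conventional thresholds. A "camel shape"
# would show a deficit in (0.10, 0.25] and a bump slightly below 0.05.
bins = [0.0, 0.05, 0.10, 0.25, 1.0]
counts, _ = np.histogram(p, bins=bins)
for (lo, hi), n in zip(zip(bins, bins[1:]), counts):
    print(f"p in ({lo:.2f}, {hi:.2f}]: {n / len(p):.1%}")
```

With honestly reported tests the shares should fall off smoothly across these bins; the paper's finding is precisely that the published distribution does not.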

My favorite part (emphasis mine):

…we find evidence that inflation is less present in articles where stars are not used as eye-catchers. To make a parallel with central banks, the choice not to use eye-catchers might be considered as a commitment to keep inflation low. Inflation is also smaller in articles with theoretical models, or in articles using data from randomized control trials or laboratory experiments.


We also present evidence that papers published by tenured and older researchers are less prone to inflation.

