“On transparency in experiments” or “Why I love monkey business”

The CEGA development blog has a series on transparency in experiments from some academic greats.

The general theme is “we need to have more pre-analysis plans to stop people from drawing wrong conclusions”.

Here, for example, is Don Green:

Not long ago, I attended a talk at which the presenter described the results of a large, well-crafted experiment.  His results indicated that the average treatment effect was close to zero, with a small standard error.  Later in the talk, however, the speaker revealed that when he partitioned the data into subgroups (men and women), the findings became “more interesting.”  Evidently, the treatment interacts significantly with gender.  The treatment has positive effects on men and negative effects on women.

A bit skeptical, I raised my hand to ask whether this treatment-by-covariate interaction had been anticipated by a planning document prior to the launch of the experiment.  The author said that it had.  The reported interaction now seemed quite convincing.  Impressed both by the results and the prescient planning document, I exclaimed “Really?”  The author replied, “No, not really.”  The audience chuckled, and the speaker moved on.  The reported interaction again struck me as rather unconvincing.

…[The] application of Bayes’ Rule suggests that planned comparisons may substantially increase the credibility of experimental results.  The paradox is that journal reviewers and editors do not seem to accord much weight to planning documents.  On the contrary, they often ask for precisely the sort of post hoc subgroup analyses that creates uncertainty about fishing.
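A minimal sketch of the Bayes' Rule logic in that last paragraph, using made-up numbers (none of them are Green's). It assumes a 10% prior that the interaction is real, 80% power, a 5% significance level, and, for the fishing case, ten independent subgroup tests. Holding power fixed across the two cases is a simplification. The function names are hypothetical.

```python
def familywise_false_positive_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def posterior_real(prior, power, false_positive_rate):
    """Bayes' Rule: P(effect is real | a significant result was reported)."""
    true_pos = prior * power
    false_pos = (1.0 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Illustrative assumptions only, not estimates from any study.
PRIOR, POWER, ALPHA, N_SUBGROUP_TESTS = 0.10, 0.80, 0.05, 10

# Planned comparison: one test, so the false positive rate is just alpha.
planned = posterior_real(PRIOR, POWER, ALPHA)

# Fished comparison: the reported result is the best of several tests.
fished = posterior_real(
    PRIOR, POWER, familywise_false_positive_rate(ALPHA, N_SUBGROUP_TESTS)
)

print(f"posterior if planned: {planned:.2f}")
print(f"posterior if fished:  {fished:.2f}")
```

Under these assumptions the same significant interaction is worth a posterior of about 0.64 when it was planned and about 0.18 when it was the best of ten tries, which is the gap between "quite convincing" and "rather unconvincing" in the anecdote.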

There are few naysayers in their series, so allow me to play one.

I agree because of course they are right. Experimental results are more credible if they are pre-specified. And it is easy to pre-specify basic results.

I disagree because I believe: (1) most of the large and unexpected findings from experiments will continue to come from unplanned analysis, and (2) these will lead to the most novel theoretical speculations and exciting future empirical work.

I take a dynamic view: too much emphasis on pre-analysis is not only boring, it slows progress in the field. We need to think about the long-term enterprise, not single experiments. I think economists and political scientists do too little inductive work, not too much.

You might say, “oh but we can do both”. You can have a pre-analysis plan, and you can analyze the dickens out of the data afterwards, so long as you disclose it. This is basically what I do in my papers, and I agree it’s better than the current norm. I wish more authors were as explicit.

What worries me is that some fields, such as psychology and medicine, have gone zonkers and (especially in leading journals) deeply penalize unplanned analysis, even alongside planned analysis. I think this is one reason these fields are deadening, boring, and slow to advance. (David Laitin makes a related argument here.)

The current state of social science is also bad: basically, you can’t really believe anyone’s results because they probably fished for them. Then again, I don’t really believe the pre-analysis results either. Pre-analysis plans so seldom yield what they expect that, when someone says “hey look I am finding results!”, I just assume that theirs was published and 12 less successful pre-analysis plans were not.
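The "one published, 12 not" intuition can be simulated. This is a rough sketch under assumptions chosen purely for illustration: 13 pre-registered studies of a treatment whose true effect is exactly zero, each yielding a z-statistic drawn from a standard normal, and a journal that only publishes results with |z| > 1.96.

```python
import random

N_STUDIES = 13          # assumed: 1 published plan + 12 unpublished ones
Z_CRITICAL = 1.96       # two-sided 5% significance threshold
N_LITERATURES = 20_000  # number of simulated sets of 13 studies


def share_with_a_published_result(n_literatures, n_studies, seed=0):
    """Share of simulated literatures in which at least one null study
    clears the significance bar and would therefore get published."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_literatures):
        # Every study tests a true effect of zero, so z ~ N(0, 1).
        if any(abs(rng.gauss(0.0, 1.0)) > Z_CRITICAL for _ in range(n_studies)):
            hits += 1
    return hits / n_literatures


simulated = share_with_a_published_result(N_LITERATURES, N_STUDIES)
exact = 1.0 - (1.0 - 0.05) ** N_STUDIES  # closed form for comparison

print(f"simulated share: {simulated:.3f}")
print(f"closed form:     {exact:.3f}")
```

With nothing real to find, close to half of these 13-study literatures still produce a publishable, fully pre-specified "finding". A pre-analysis plan protects against fishing within a study, not against selection across studies.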

The morals of the story: first, believe nothing, and second, make marginal improvements to rigor but don’t let them rule you or lull you into complacency.

Actually, I am not sure this is a disagreement, because it’s entirely possible the full panel would agree with this sentiment. I would be interested to hear.