Andy Gelman responds to yesterday’s matching rant:
I see what Chris is getting at–matching, like regression, won’t help for the variables you’re not controlling for–but I disagree with his characterization of matching as a weighting scheme. I see matching as a way to restrict your analysis to comparable cases. The statistical motivation: robustness. If you had a good enough model, you wouldn’t need to match, you’d just fit the model to the data. But in common practice we often use simple regression models and so it can be helpful to do some matching first before regression. It’s not so difficult to match on dozens of variables, but it’s not so easy to include dozens of variables in your least squares regression. So in practice it’s not always the case that “you are simply matching based on those same X.” To put it another way: yes, you’ll often need to worry about potential X variables that you don’t have–but that shouldn’t stop you from controlling for everything that you do have, and matching can be a helpful tool in that effort.
Beyond this, I think it’s useful to distinguish between two different problems: imbalance and lack of complete overlap. See chapter 10 of ARM for further discussion. Also some discussion here.
This sounds right to me. When I think of matching as a weighting scheme, it’s weights that downplay or eliminate cases that have little business being compared. Maybe that’s incorrect in a technical sense, but I find it a useful way to think about the technique.
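To make the "weights" reading concrete, here is a minimal sketch in Python. It assumes nearest-neighbor matching with replacement on a single covariate, with a caliper. The function name `matching_weights`, the caliper value, and the toy numbers are all made up for illustration; this is not taken from Gelman's post or from any particular matching package.

```python
import numpy as np

def matching_weights(x_treated, x_control, caliper):
    """Nearest-neighbor matching with replacement on one covariate.

    Returns (weights, matched):
      weights[j] = number of treated units whose nearest control is j,
                   counted only when that control lies within the caliper;
      matched[i] = True if treated unit i found a control within the caliper.
    Controls nobody picks get weight 0, so they drop out of the comparison.
    """
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    weights = np.zeros(len(x_control))
    matched = np.zeros(len(x_treated), dtype=bool)
    for i, x in enumerate(x_treated):
        dist = np.abs(x_control - x)
        j = int(np.argmin(dist))      # closest control to treated unit i
        if dist[j] <= caliper:        # only accept close-enough matches
            weights[j] += 1
            matched[i] = True
    return weights, matched

# Toy data (hypothetical): one treated unit and two controls have no close counterpart.
x_treated = [1.0, 1.1, 5.0]
x_control = [0.9, 1.2, 3.0, 9.0]
weights, matched = matching_weights(x_treated, x_control, caliper=0.5)
print(weights)   # per-control weights
print(matched)   # which treated units were kept
```

In this toy case the controls at 3.0 and 9.0 end up with zero weight and the treated unit at 5.0 is left unmatched, which is the "eliminate cases that have little business being compared" idea in its bluntest form.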
One of the downsides of matching when there are unobservables involved is that you might be discarding observations that have very good reason to be compared. In this case you might consider showing both the matching and regression results, and letting the reader judge for themselves. Better yet: make those unobservables observable, or look for opportunities to answer the question through natural experiments.
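As a rough sketch of what reporting both results side by side could look like, the Python below computes a plain OLS estimate and a nearest-neighbor matching estimate on the same data. Everything here is an assumption for illustration: the data are made up and noise-free, the true effect is set to 2, the outcome is deliberately curved in x so that a linear regression is misspecified, and the helper names are hypothetical. Neither estimate does anything about unobservables.

```python
import numpy as np

def regression_estimate(y, t, x):
    """OLS coefficient on treatment t from the linear model y ~ 1 + t + x."""
    design = np.column_stack([np.ones_like(x), t, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[1])

def matching_estimate(y, t, x, caliper):
    """Mean treated-minus-control outcome difference over nearest-neighbor
    matches (with replacement) that fall within the caliper.
    Returns (estimate, number of treated units that were matched)."""
    treated = np.flatnonzero(t == 1)
    control = np.flatnonzero(t == 0)
    diffs = []
    for i in treated:
        dist = np.abs(x[control] - x[i])
        j = int(np.argmin(dist))
        if dist[j] <= caliper:
            diffs.append(y[i] - y[control[j]])
    return float(np.mean(diffs)), len(diffs)

# Made-up data: treated units sit at x = 0, 1, 2; half the controls are far away.
x = np.array([0, 1, 2, 0, 1, 2, 8, 9, 10], dtype=float)
t = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=float)
y = 2.0 * t + x ** 2   # assumed true effect of 2, nonlinear in x

reg_est = regression_estimate(y, t, x)
match_est, n_matched = matching_estimate(y, t, x, caliper=0.5)
print(f"regression: {reg_est:.2f}  matching: {match_est:.2f}  "
      f"(matched {n_matched} of {int(t.sum())} treated)")
```

Here the two numbers disagree because the distant controls pull the straight-line fit away from the assumed effect, while matching simply ignores them. With real data you would not know which one is closer to the truth, which is the argument for printing both.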
The cardinal sin of political science dissertations? Jumping to a case and data collection before you have a research design. You should have your research design, tests for your assumptions, and a complete write-up plan before you ever collect a single variable. One of the great things about randomized control trials is not just that they eliminate many sources of bias, but that they force you to follow the scientific method (to a degree).