Statistician Neal Beck just justified my longstanding hatred and loathing of logit

Neal is probably horrified by my slightly inaccurate title, but we all know this blog ain’t the New York Times.

In 2010 I was on sabbatical at NYU’s political science department. Neal asked me why I always used ordinary least squares regressions (OLS) when my dependent variable was a 1 or 0. Why not logit instead of this linear probability model? He had seen economists do this before, and was surprised at my reply—that it had become general practice in a lot of applied economics.

I think the reason economists often go the linear route is that it generates estimates that are very simple to interpret, which is not the case with logit. And those estimates are basically correct, since logit is unlikely to yield a different answer. Clarity wins in my book.
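To make that comparison concrete, here is a minimal sketch in Python on simulated data. Everything in it is my own assumption for illustration (the sample size, the coefficients, the single normally distributed regressor), and it is not taken from Neal's simulations. It fits a linear probability model by OLS and a logit by Newton-Raphson, then compares the OLS slope with the logit's average marginal effect. A normal regressor is a friendly case for OLS, so treat this as a best-case illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Assumed data-generating process: a logit with one normal regressor.
x = rng.normal(size=n)
true_p = 1.0 / (1.0 + np.exp(-(-0.5 + 0.8 * x)))
y = (rng.uniform(size=n) < true_p).astype(float)

X = np.column_stack([np.ones(n), x])  # intercept plus regressor

# Linear probability model: plain OLS of the 0/1 outcome on x.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Logit fitted by Newton-Raphson on the log-likelihood.
beta_logit = np.zeros(2)
for _ in range(25):
    mu = 1.0 / (1.0 + np.exp(-X @ beta_logit))
    grad = X.T @ (y - mu)
    hess = X.T @ (X * (mu * (1.0 - mu))[:, None])
    beta_logit = beta_logit + np.linalg.solve(hess, grad)

# Average marginal effect of x implied by the logit:
# the sample mean of p * (1 - p), times the logit slope.
mu = 1.0 / (1.0 + np.exp(-X @ beta_logit))
ame_logit = np.mean(mu * (1.0 - mu)) * beta_logit[1]

print("OLS slope:                    ", beta_ols[1])
print("Logit slope (log-odds scale): ", beta_logit[1])
print("Logit average marginal effect:", ame_logit)
```

The OLS slope reads directly as a change in probability per unit of x. The raw logit slope does not, and has to be converted to a marginal effect before the two are comparable.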

It interested him enough to run some simulations, and what he found didn’t dissuade me from my sloppy clarity. Several years later, I get an email from Neal titled “At last!” It is a paper. The abstract:

This article deals with a very simple issue: if we have grouped data with a binary dependent variable and want to include fixed effects (group specific intercepts) in the specification, is Ordinary Least Squares (OLS) in any way superior to a (conditional) logit form? In particular, what are the consequences of using OLS instead of a fixed effects logit model in respect to the latter dropping all units which show no variability in the dependent variable while the former allows for estimation using all units.

First, we show that the discussion of fixed effects logit (and the incidental parameters problem) is based on an assumption about the kinds of data being studied; for what appears to be the common use of fixed effect models in political science the incidental parameters issue is illusory.

Turning to linear models, we see that OLS yields a perhaps odd linear combination of the estimates for the units with variation in the dependent variable and units without such variation, and so the coefficient estimates must be carefully interpreted.

The article then compares two methods of estimating logit models with fixed effects, and shows that the Chamberlain conditional logit is as good as or better than a logit analysis which simply includes group specific intercepts (even though the conditional logit technique was designed to deal with the incidental parameters problem!).

Related to this, the article discusses the estimation of marginal effects using both OLS and logit. While it appears that a form of logit with fixed effects can be used to estimate marginal effects, this method can be improved by starting with conditional logit and then using those parameter estimates to constrain the logit with fixed effects model. This method produces estimates of sample average marginal effects that are at least as good as OLS, and better when group size is small. However, this is based on simulations favorable to the logit setup.

So it can be argued that OLS is not “too bad” and so can be used when its use simplifies other matters (such as endogeneity). These issues are simple to understand, but it appears that applied researchers have not always taken note of these issues.
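The abstract's point about units with no variability in the dependent variable can be sketched on simulated grouped data. The setup below is my own assumption (200 groups of 5, normal group intercepts, a slope of 0.5 on the log-odds scale) and does not reproduce the paper's simulations. It counts the groups a conditional logit would drop, then computes the fixed-effects OLS slope two ways: using all groups, and using only the groups whose outcome varies.

```python
import numpy as np

rng = np.random.default_rng(1)
n_groups, group_size = 200, 5

# Assumed data-generating process: logit with group-specific intercepts.
alpha = rng.normal(scale=2.0, size=n_groups)
x = rng.normal(size=(n_groups, group_size))
p = 1.0 / (1.0 + np.exp(-(alpha[:, None] + 0.5 * x)))
y = (rng.uniform(size=p.shape) < p).astype(float)

# Groups whose outcome is all 0 or all 1 carry no information for a
# conditional (fixed effects) logit, so that estimator drops them.
successes = y.sum(axis=1)
no_variation = (successes == 0) | (successes == group_size)
n_dropped = int(no_variation.sum())

# Fixed-effects OLS via the within transformation (demean by group).
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)

slope_all = (xd * yd).sum() / (xd ** 2).sum()
keep = ~no_variation
slope_varying = (xd[keep] * yd[keep]).sum() / (xd[keep] ** 2).sum()

# Share of the within-group variation in x that sits in varying groups.
weight = (xd[keep] ** 2).sum() / (xd ** 2).sum()

print("groups dropped by conditional logit:", n_dropped, "of", n_groups)
print("OLS slope, all groups:          ", slope_all)
print("OLS slope, varying groups only: ", slope_varying)
print("weight on varying groups:       ", weight)
```

In the no-variation groups the demeaned outcome is exactly zero, so they add nothing to the numerator of the OLS slope but still add to the denominator. The all-groups slope is therefore the varying-groups slope times `weight`, which is a weighted average of that slope and zero. That is one simple way to read the "perhaps odd linear combination" the abstract warns about.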