IPA’s weekly links

Guest post by Jeff Mosenkis of Innovations for Poverty Action.

From http://www.sciencemag.org/content/349/6251/aac4716.full

47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects.

  • Headlines are shouting “fewer than half of psychology studies replicate,” but Vox’s reporter says that after talking with one of the researchers about how hard it was to set up the identical experiment (in a different country/language), she was surprised anything replicates at all. Failure might not mean the original finding was a fluke; it could just mean they didn’t manage to measure the same thing the same way.
  • A prominent researcher says the NYTimes misquoted him on this story to make him sound more skeptical than he is, and posted his own notes from the conversation, a great lesson for any researcher being quoted.
  • Bruce Wydick has an interesting post on what academic and faith-based development practitioners can learn from each other.
  • In the US, subsidies for people to move from poor neighborhoods to better ones had positive long-term outcomes (PDF), but Barnhardt, Field, & Pande find that in India, taking people away from their social support networks is very disruptive, and many people refused or left the program (PDF).

And a new visualization tool lets you play with data to show how moving data points (even one outlier) changes a correlation and variance overlap.
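
The outlier effect can also be seen without the tool. Below is a minimal sketch in Python; the six data points are made up for illustration and are not taken from the visualization, and `pearson_r` is a hypothetical helper name.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equal-length 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # center each variable
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Five illustrative points constructed to have zero correlation.
x = [-2, -1, 0, 1, 2]
y = [1, -1, 0, -1, 1]
r_before = pearson_r(x, y)

# Add a single extreme point far from the rest of the cloud.
r_after = pearson_r(x + [10], y + [10])

print(f"r without outlier: {r_before:.2f}")
print(f"r with one outlier: {r_after:.2f}")
print(f"shared variance (r^2) with outlier: {r_after ** 2:.2f}")
```

With these made-up numbers, one added point moves the correlation from zero to roughly 0.92, which is the kind of swing the tool lets you explore by dragging points around.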


Kenya crime and police bleg

What organizations and donors are doing interesting or innovative programs in policing tactics, police/justice reform, or crime reduction in Kenya?

Mainly I’m looking at crime and gangs but I’m also interested in police relations with Somali refugee populations in the capital.

Contacts in the police and security sector are welcome as well. I’m especially interested in meeting internal reformers.

I can be emailed here.

Kenya food bleg

I’m headed to Nairobi this evening. I haven’t been back for 11 years. (In fact the last time I was there I met my wife because we happened to be sitting next to one another in what I can only assume was the country’s slowest Internet cafe. Just think how many relationships will be killed by broadband.)

Anyways, I will be working furiously on a project but I do like to eat good food, and restaurant recommendations are welcome. I’m not so interested in what I can eat in New York, so the best Thai restaurant or burger is not my object. But the Kenyan, Ethiopian, and Indian food in New York stinks (at least in Manhattan) so those kinds of recommendations are welcome. I have a soft spot for Choma. But please do not tell me to go to Carnivore unless something has changed dramatically in the last decade.

IPA’s weekly links

Guest post by Jeff Mosenkis of Innovations for Poverty Action.

via National Geographic


  • National Geographic had a master taxidermist come up with a fake elephant tusk with an embedded GPS tracker so they could track the route of the illegal ivory trade live via satellite through the DRC, LRA territory, and Darfur. Article here, interactive map here, & NPR Fresh Air radio interview here.
  • New McKenzie paper on a randomized microenterprise business plan/grant competition in Nigeria (PDF). Out of 24,000 entries, winners went on to grow significantly more, mostly due to the $50,000 grant enabling investment.
  • 538 has an article about p-values, replications and science with a fun live feature allowing you to p-hack “your way to scientific glory” proving conclusively that Republicans [or Democrats] are better [or worse] for the economy by adding or subtracting variables till you get to p<.05. The article’s point is that science is supposed to be iterative and media/public expectations about finality on any issue are unrealistic. And a lesson in realistic expectations from a group of biostatisticians writing up methods guidelines for other researchers:

“We had to go back about 17 [of our own] papers before we found one without an error”


  • A great interview from the Center for Global Development podcast. In 20 minutes, Karthik Muralidharan manages to explain why large numbers of kids in the developing world drop out and how to fix it. The problem is that curricula are designed for the top 20% of students, and those who can’t keep up get left behind. His one piece of advice to policymakers – focus relentlessly on getting kids basic reading and math skills by grade 2 so they’ll be able to benefit for the coming years.

And from the globalization file: Liam Murphy, an engineer from Ireland, was visiting Abu Dhabi and took a cab to Ferrari World theme park. When he found out his cab driver, an Indian guest worker, was going to wait outside for the day & had never been to a theme park, the engineer paid for him to come along. So here’s your photo of an Indian migrant worker & Irish engineer on an Italian-themed roller coaster in an emirate.






IPA’s weekly links

Guest post by Jeff Mosenkis of Innovations for Poverty Action.


And SMBC comics explains economists’ plan to fix the world in one cartoon.

IPA’s weekly links

Guest post by Jeff Mosenkis of Innovations for Poverty Action.


  • Worm Wars update:
    • Michael Clemens and Justin Sandefur have the best explainer so far.
    • Esther Duflo and Dean Karlan (for J-PAL and IPA) summarize what the debate means for open science, and would like an independent researcher to re-appraise the data.
    • Which will be easier now that the authors of the re-analysis have posted their work online.
    • And walking public good David Evans has collected everything about the debate here.
  • A study in PLOS ONE finds when the government started requiring pre-analysis plans for studies on cardiovascular drugs, success rate went from 57% to 8%. (h/t Sandefur/Lanthorn)
  • And The Berkeley Initiative for Transparency In the Social Sciences has a prize for reproducible/open social science research (with a good blog post here).
  • NPR has the story of an MIT student who learned in class about Dupas’ RCT showing that informing teen girls of HIV risk with older “sugar daddies” reduces rates of teen pregnancies, and started a non-profit in Botswana to do just that.

    “Out of hundreds of studies, here was one of the few that had that big an impact,” says Angrist, “and it sat year after year after year accumulating dust on the library shelf.”

    “I thought, ‘This is my chance to turn research into action.’”

Cash transfers also lower rates of teen pregnancy, but there are some statistics you can’t argue with. (h/t Max Roser)


This early childhood program increased voting by 40% twenty years later

I find a strong relationship between the non-cognitive skills of grit, self control, behavioral control, and social skills measured in childhood and political participation in adulthood. This strong relationship holds even when considering measures of cognitive ability and other potential confounders.

Simply put, those children who develop non-cognitive skills are more likely to participate in adulthood. Going one step further, I test whether exogenous improvements in non-cognitive skills during early childhood translate into participation increases in adulthood. To do so, I use a unique 20-year, multi-site field experiment—the Fast Track intervention…

I show that this early-childhood field experiment targeted, and successfully moved, students’ non-cognitive skills, while leaving their cognitive skills and other factors relevant to political participation virtually unchanged…

Exposure to this program increased turnout among participants 11-14 percentage points—a substantial amount, constituting at least a 40% increase in baseline participation rates…

It appears that Fast Track mobilized because it taught children to regulate their thoughts, behaviors, and emotions and use these abilities to integrate in society.

A new paper by John Holbein. Unfortunately I did not see data on party affiliation.

In related news, my colleagues Julian Jamison and Dean Karlan produce a worthy contender for an Ig-Nobel prize for their Halloween candy voting experiments with children.

We decorated one side of a house porch with McCain material in 2008 (Romney material in 2012) and the other side with Obama material. Children were asked to choose a side, with half receiving the same candy on either side and half receiving more candy to go to the McCain/Romney side. This yields a “candy elasticity” of children’s political support.

Worm wars continued (but not by me)

Michael Clemens and Justin Sandefur at CGD weigh in:

Suppose a chemistry lab claimed that when it mixed two chemicals, the mixture rose in temperature by 60 degrees. Later, a replication team reviewed the original calculations, found an error, and observed that the increase in temperature was only 40 degrees.

It would be strictly correct for the replication team to announce, “We fail to replicate the original finding of 60 degrees.” That’s a true statement by itself, and it doesn’t fall within the strict purview of a pure replication to do additional tests to see whether the mix rose by 30 degrees, or 40 degrees, or whatever.

But in this situation it would be excessive to claim that replication “debunks the finding of a rise in temperature,” because the temperature certainly did rise, by a somewhat different amount. This is basically what’s happened with the deworming replication, as we’ll explain.

I haven’t seen many non-economist responses, other than Stéphane Helleringer’s comments on this blog. Have I missed them, or do they not exist?

IPA’s weekly links

Guest post by Jeff Mosenkis of Innovations for Poverty Action.

  • With the debate over deworming in danger of overtaking actual worms in terms of lost productivity, a reminder that much arguing about what analysis methods were chosen can be solved with a pre-analysis plan. The Journal of Economic Perspectives has two helpful articles:
    • Olken’s Promises and Perils of Pre-analysis Plans goes over a checklist of advantages, such as allowing for choosing unconventional tests without accusations of cherry picking, and tips for compromises, like making some of the initial data cleaning choices “blinded” (without separating treatment/control).
    • Coffman and Niederle argue Pre-analysis Plans Have Limited Upside, Especially Where Replications Are Feasible, and the problem is really that there’s little incentive for academic researchers to replicate.
    • One thought: A stats prof of mine who was a former physicist in his 80’s (at least), used to say back when stats were done by hand or after waiting days for a turn at the university’s basement-sized mainframe computer, people chose their stats tests and planned their analysis far in advance and much more carefully. When there was a “cost” to each analysis, the process was slower & more deliberative – essentially right in between “pre-analysis plan before study starts” and “test as you go.”

And Nigerian lawyer and satirist Elnathan John offers the Gospel According to Aid.

This tool will help you engineer the results you want from randomized trials!

Clinical drug trials are conducted by pharmaceutical firms to establish the effectiveness and safety of new treatments, but failure to publish the results of all trials is skewing medical science (as highlighted in our story from this week’s Science section). Using our interactive simulator, run a series of clinical trials for yourself and discover how to play the system by publicising results in favour of your own product.

The Economist has a clinical trials simulator. Hat tip Ben Goldacre.
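
The mechanism the simulator illustrates, selective publication, can be sketched in a few lines of Python. This is not The Economist's code: the trial sizes, the drug with zero true effect, the known-variance z-test, and the function name `run_trials` are all assumptions chosen to keep the example short.

```python
import numpy as np

def run_trials(n_trials=2000, n_per_arm=50, true_effect=0.0, seed=0):
    """Simulate many two-arm trials with unit-variance outcomes.

    Returns the estimated effect (difference in arm means) and the
    z-statistic for each trial. The outcome variance is treated as
    known (equal to 1), which is a simplifying assumption.
    """
    rng = np.random.default_rng(seed)
    treated = rng.normal(true_effect, 1.0, size=(n_trials, n_per_arm))
    control = rng.normal(0.0, 1.0, size=(n_trials, n_per_arm))
    diff = treated.mean(axis=1) - control.mean(axis=1)
    se = np.sqrt(2.0 / n_per_arm)  # standard error of the difference
    return diff, diff / se

diff, z = run_trials()

# "Play the system": publish only trials that look significantly good.
published = diff[z > 1.96]

print(f"trials run: {diff.size}, trials published: {published.size}")
print(f"mean effect, all trials: {diff.mean():.3f}")
print(f"mean effect, published only: {published.mean():.3f}")
```

Even though the simulated drug does nothing, the published subset shows a clearly positive average effect, because the filter keeps only the lucky draws.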

What if Spiderman were black, and Uncle Ben was shot by police?

I keep thinking how much more powerful the Spiderman origin story would be if Peter Parker was an African American kid, whose Uncle Ben was shot by police while being arrested for a minor parking infraction. There is no formal investigation, and Peter decides to put himself on the line to prevent it happening again. He tackles the white crimes that go unpunished, punishes POC criminals fairly. He is the leveler, always fighting to be without bias, to be just. To protect people like his uncle.

This not only mirrors so much of what’s happening in America, but feeds right into the complex relationship between Spiderman, the authorities and the media.

Peter Parker is a brilliant student, awkward, a nerd, but is branded a thug, a gang member, a criminal, because of his appearance. The media latch on to that and misrepresent him totally.

The police, humiliated by the fact that he refuses to work with them and often punishes cops themselves for brutalizing innocent people, or guilty people who still deserve better treatment than they get, attempt to hunt him down.


Best comment: “J. Jonah Jameson’s attitude would be remarkably unchanged in this scenario.”

Hat tip to Suresh Naidu.

The 10 things I learned in the trenches of the Worm Wars

If you have no idea what I’m talking about, either count yourself lucky or see yesterday’s post. The rest of you, carry on.

  1. One of the things I love about the Internet is that it brought a lot of very smart people to a key intellectual problem, the discussion brought out the central assumptions and claims, and they were answered within about a day or two. See Berk Ozler, for example. My conclusion is that the Kenya deworming results are relatively robust. Yay hive mind.
  2. On the other hand, the hive mind has a tendency to be grumpy, rude, shrill and angry. I found the debate more dignified than some, but vicious at times.
  3. I am guilty as well. I was too quick to suspect and insinuate bad faith on the part of the replicators. I can hold suspicions but I shouldn’t publicly insinuate or accuse without grounds. I am sorry for that.
  4. I do find any big, coordinated media push of a scientific finding to be problematic, to say the least. This drove and drives my suspicion of bias, even if accidental.
  5. To me the big failure in this entire business was by the editor of the academic journal. The competing claims on whether the results are fragile, and why, should never have been allowed to remain ambiguous.
  6. To me, a glitzy media push undermined the credibility and intentions of the journal further. This is a general problem in medicine and hard science that I do not see as much in social science (where the journals couldn’t care less about news coverage).
  7. On the journalist side, I can’t blame any of the writers for not following the finer statistical points. I had trouble myself. But almost none of the journalists read the reply by Miguel and Kremer (published by the same journal) and maybe none called Miguel or Kremer on the phone. I am told I was the first. Tell me if I’m wrong, but isn’t this the definition of sloppy journalism?
  8. I think the deworm the world movement has also tended to exaggerate or selectively quote the evidence in order to justify the cause. GiveWell has a much more balanced account: the evidence is not that great, but it’s good enough. Sort of. I thought GiveWell had one of the best posts. Do read it.
  9. To me, the real tragedy is that, 18 years after the Kenya deworming experiment (which was not even a real experiment) we do not have large-scale, randomized, multi-country, long-term evidence on the health, education, and labor market impacts of deworming medicine. This is not some schmuck cause. This is touted as one of the most promising development interventions in human history.
  10. I also fear for the reputation of replications in development economics. I imagine some of the problems could be addressed by getting more clarity into pre-analysis plans for replications. But the incentives to make mountains out of molehills are huge. Maybe everyone should sign a “no glitzy media push” pledge.

Okay, I am DONE with worms.

IPA’s weekly links

Guest post by Jeff Mosenkis of Innovations for Poverty Action.

Our apologies, the links are a bit late, but you’ll never believe what happened this week:

  • So Ted Miguel …. was on NPR’s All Things Considered, talking about why plans to use small solar panels to power Africa aren’t the answer, when most Kenyans already live right under power lines.
  • And in open science/replication news – some researchers at Berkeley shared their data publicly – and another research team beat them to publication. (Follow up from the researchers.)
  • Also, in a respected British source … Survey finds 60% of problems in replications can be solved if research teams would just talk to one another.
  • Seriously, new replications of the classic deworming findings reaffirmed some conclusions, but also called some into question, along with the Cochrane collaboration. Then twitter exploded. Discussion on Chris’ blog as well as GiveWell’s independent analysis both suggest unchanged policy and investment recommendations.
    • Other views from the Guardian, Vox, and Buzzfeed, original authors (PDF), and Berk Ozler. (It’s worth noting when reading general audience reporting on this, how statistical terms like “error” and “bias” with dual meaning can cloud the discussion when repeated). 
  • We’ve reported on the new Chinese-led development bank; there’s now a newer BRICS (Brazil, Russia, India, China and South Africa) development bank. (Literally, it’s called the New Development Bank).
  • The RISE program is accepting proposals for education research through August 1:

    RISE is a new multi-country research programme that aims to build an understanding of education systems and how they can be transformed to accelerate learning for all.


Dear journalists and policymakers: What you need to know about the Worm Wars

One of my favorite science writers, Ben Goldacre, enters the so-called Worm Wars. He’s not alone, with a flurry of new articles today. The question is simple: is a deworming pill that costs just a few cents one of the most potent anti-poverty interventions of our time?

Below is the picture from Goldacre’s post. I assume Buzzfeed editors chose it. It’s a nice illustration that nothing you will read in this debate is dispassionate. Everyone wants one thing: your clicks (and retweets, and likes, and citations). Most writers sincerely want the truth too. Sadly the two are not always compatible.

In brief: Ted Miguel and Michael Kremer are Berkeley and Harvard economists who ran the original deworming study that showed big effects of the medicine on school attendance in Kenya—one of the few to attempt to measure such impacts. That study ignited the impact evaluation movement in international development, especially through their students (like me). It also ignited a movement to deworm the world. This is a big claim, worth investigating. Calum Davey led the team who did a replication.

I know this study. In fact, as a first year graduate student I spent a summer working for Miguel and Kremer designing their long term follow up survey. Relationships are incestuous on all sides of the deworming debate, so you can hardly call me an impartial judge. Nonetheless, bear with me as I try.

I haven’t paid much attention to the deworming world for more than a decade. So I spent last night and this morning reading as much as I could. There’s an overwhelming amount to process, but I’ve drawn a few early conclusions.

The bottom line is this: both sides exaggerate, but the errors and issues with the replication seem so great that it looks to me more like attention-seeking than dispassionate science. I was never convinced that we should deworm the world. There are clearly serious problems with the Miguel-Kremer study. But, to be quite frank, you have to throw so much crazy sh*t at Miguel-Kremer to make the result go away that I believe the result even more than when I started.

Statistician Neal Beck just justified my longstanding hatred and loathing of logit

Neal is probably horrified by my slightly inaccurate title, but we all know this blog ain’t the New York Times.

In 2010 I was on sabbatical at NYU’s political science department. Neal asked me why I always used ordinary least squares regressions (OLS) when my dependent variable was a 1 or 0. Why not logit instead of this linear probability model? He had seen economists do this before, and was surprised at my reply—that it had become general practice in a lot of applied economics.

I think the reason economists often go the linear route is that it generates estimates that are very simple to interpret, which is not the case with logit. And they are basically correct, with logit unlikely to yield different answers. Clarity wins in my book.

It interested him enough to run some simulations, and what he found didn’t dissuade me from my sloppy clarity. Several years later, I get an email from Neal titled “At last!” It is a paper. The abstract:

This article deals with a very simple issue: if we have grouped data with a binary dependent variable and want to include fixed effects (group specific intercepts) in the specification, is Ordinary Least Squares (OLS) in any way superior to a (conditional) logit form? In particular, what are the consequences of using OLS instead of a fixed effects logit model in respect to the latter dropping all units which show no variability in the dependent variable while the former allows for estimation using all units.

First, we show that the discussion of fixed effects logit (and the incidental parameters problem) is based on an assumption about the kinds of data being studied; for what appears to be the common use of fixed effect models in political science the incidental parameters issue is illusory.

Turning to linear models, we see that OLS yields a perhaps odd linear combination of the estimates for the units with variation in the dependent variable and units without such variation, and so the coefficient estimates must be carefully interpreted.

The article then compares two methods of estimating logit models with fixed effects, and shows that the Chamberlain conditional logit is as good as or better than a logit analysis which simply includes group specific intercepts (even though the conditional logit technique was designed to deal with the incidental parameters problem!).

Related to this, the article discusses the estimation of marginal effects using both OLS and logit. While it appears that a form of logit with fixed effects can be used to estimate marginal effects, this method can be improved by starting with conditional logit and then using those parameter estimates to constrain the logit with fixed effects model. This method produces estimates of sample average marginal effects that are at least as good as OLS, and better when group size is small. However, this is based on simulations favorable to the logit setup.

So it can be argued that OLS is not “too bad” and so can be used when its use simplifies other matters (such as endogeneity). These issues are simple to understand, but it appears that applied researchers have not always taken note of these issues.
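
For readers who want to see the basic OLS-versus-logit comparison for themselves, here is a minimal sketch in Python. It is not Beck's simulation and it leaves out fixed effects entirely: the data-generating process, the sample size, and the helper names `sigmoid` and `fit_logit` are all assumptions made for illustration. It fits a linear probability model by least squares, fits a logit by Newton-Raphson, and compares the OLS coefficient on a binary treatment with the logit's sample average marginal effect.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logit(X, y, n_iter=25):
    """Maximum-likelihood logit via Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (y - p)                       # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Made-up data: binary outcome, randomized binary treatment d, covariate x.
rng = np.random.default_rng(0)
n = 5000
d = rng.integers(0, 2, size=n).astype(float)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), d, x])
y = (rng.random(n) < sigmoid(-0.5 + 0.8 * d + 0.5 * x)).astype(float)

# Linear probability model: the coefficient on d is read directly
# as a change in probability.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
lpm_effect = beta_ols[1]

# Logit: average, over the sample, the predicted change in probability
# from switching d from 0 to 1.
beta_logit = fit_logit(X, y)
X1, X0 = X.copy(), X.copy()
X1[:, 1], X0[:, 1] = 1.0, 0.0
logit_ame = float(np.mean(sigmoid(X1 @ beta_logit) - sigmoid(X0 @ beta_logit)))

print(f"LPM coefficient on d: {lpm_effect:.3f}")
print(f"Logit average marginal effect of d: {logit_ame:.3f}")
```

In a simple setup like this one the two numbers come out very close, and only the OLS one can be read straight off the regression table. The cases the abstract worries about, grouped data with fixed effects and units with no variation in the outcome, are not covered by this sketch.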

The problem with evidence based policy change is we don’t have evidence on the important policies

Peter Singer has a Boston Review piece telling us we should all be “effective altruists”—to make a difference by giving our time and our money, and giving only to causes that demonstrate their effectiveness through evidence.

There are many good replies, including from Angus Deaton and Daron Acemoglu. Here is Acemoglu:

More evidence is always preferred, but precise measurement of the social value of a donated dollar may be impossible. What is the social value of a dollar given to Amnesty International as opposed to Oxfam or an NGO providing vaccines or textbooks?

…But the problem is thornier still. A large body of research shows that economic development is the best way to lift millions out of poverty and improve their health, education, and access to public amenities. So one has to take into account how charities’ activities affect economic development, which is essentially impossible. If, as some economists and political scientists suggest, changes in political and economic institutions are critical for long-run economic growth, then watchdog organizations such as Amnesty may be essential for transforming dysfunctional regimes. Effective altruists don’t (yet?) see the importance of these more political organizations.

To his critique (and Deaton’s as well): Yes! At the same time, some reservations:

  • It’s hard to overstate how many stupid and dead end causes people give money to. Singer probably sees very rich people giving to idiotic boutique causes all the time. I avoid those people, but I have to contend with the World Bank and others spending billions on things like vocational training, which has basically zero impact. This is another way of saying Singer is right on the margin, Acemoglu is right as we move away from the marginal decision.
  • I’m not worried about “too much” effective altruism. It would be a problem if it happened, but the world won’t even get close. Aid donors and the very rich are (1) stubborn, and (2) don’t read.
  • But Acemoglu is right that institutional and political change are more important and the evidence-based crowd have done very little here. Most of that evidence is about anti-corruption or election monitoring or other things that I doubt change politics very much.
  • Meanwhile all the good political economy research (like Acemoglu’s) has no clear implication for social and political change in the world. There is a big disconnect. These scholars have mostly ignored this gap either because… I don’t know why. Maybe it’s too treacherous or hard, or they don’t find it interesting enough, or they are cynical about policy change. I don’t know. Someone explain it to me.
  • Actually, this is not entirely true. You could view a lot of research as saying “you should stop violent conflicts, and here are concrete steps to do so”. I can think of few better short-term investments. More work along these lines strikes me as a good thing.

If fixing gender imbalances in academia didn’t seem hard enough already…

We analyze how a larger presence of female evaluators affects committee decision-making using information on 100,000 applications to associate and full professorships in all academic disciplines in two countries, Italy and Spain.

These applications were assessed by 8,000 evaluators who were selected through a random draw. A larger number of women in evaluation committees does not increase either the quantity or the quality of female candidates who qualify. If anything, when evaluators are not familiar with candidates’ research area, gender-mixed committees tend to be less favorable towards female candidates than all-male committees, with the exception of evaluations to full professorships in Spain.

Data from 300,000 individual voting reports suggests that men become less favorable towards female candidates as soon as a woman joins the committee.

Article. You may be thinking, “oh Southern Europeans are not like us” but I am not so sure.

Then again, my colleague Bob Erikson finds that female judges on US appellate courts influence the votes of male judges to be more liberal on sex discrimination cases.

Comics guru Alan Moore eats his young

To my mind, this embracing of what were unambiguously children’s characters at their mid-20th century inception seems to indicate a retreat from the admittedly overwhelming complexities of modern existence…

It looks to me very much like a significant section of the public, having given up on attempting to understand the reality they are actually living in, have instead reasoned that they might at least be able to comprehend the sprawling, meaningless, but at-least-still-finite ‘universes’ presented by DC or Marvel Comics.

I would also observe that it is, potentially, culturally catastrophic to have the ephemera of a previous century squatting possessively on the cultural stage and refusing to allow this surely unprecedented era to develop a culture of its own, relevant and sufficient to its times.

Article. It’s an interesting interview.

I personally enjoy a good superhero movie. Presumably the entertainment industry will always find a way to serve people their childhoods twenty to thirty years later. This is normal. Even My Little Pony is back in force.

A healthy art will do more than this. While serving the kitsch, it will push the frontier of what’s possible. I think science fiction has been doing this. Fantasy not so much. Comics I can’t say, since I haven’t been paying attention. Readers? Is there an artistic frontier?