Do the Millennium Villages work?

Short answer: we have no idea. But I’m hoping that northern Uganda might be the first real test case.

Millennium Villages are springing up in dozens of poor countries. While the package of health, education, agriculture and other interventions is undoubtedly helping, we have little sense of whether it represents the most effective use of donor and national government resources, whether it can be scaled up to a national level, or what the distribution of benefits is within a village (i.e. the equity of the program).

We also have little sense whether the central hypothesis underlying the MVs–that the whole of the intervention is greater than the sum of its parts–is correct. The ‘big push’ idea of development suggests that an educational intervention is limited by health or other constraints, and that a program to lift multiple constraints at the same time will have disproportionate impacts. This could be one of the biggest and most important hypotheses waiting to be examined in all of development.

I have been discussing the possibility of evaluating the impact of a new MV program in northern Uganda with UNDP and the Ugandan government. Here is an excerpt of what I had to say to them.

Most evaluations are essentially interested in arriving at a return on investment figure: that is, what is the impact of spending X dollars on Program A? I think this is important information, but I believe that implementing agencies will be better served by also learning how to do Program A better given X dollars, or the relative impact of spending X dollars on Program A versus B, C or D.

In essence, the nature of the evaluation depends on what you want to learn.

First, if you want to learn the impact of the MV program overall, the evaluation strategy should compare MV villages to a comparison group of villages that did not receive the MV intervention. For example, I am working with an NGO in Acholiland to implement community development programs, and we are comparing individuals and communities that receive the development grants to those that do not. In the examples I gave above, this is the type of evaluation that gives you the straight return on investing in “Program A”.

Second, if you want to learn the added or incremental impact of a particular MV intervention (such as psychosocial programming, or a particular health intervention) then the evaluation strategy would be to compare villages that receive the full MV program to villages that receive the MV program minus the added or experimental intervention. For example, in the evaluation I mentioned above, some of the villages are receiving group dynamics training and added facilitation to try to make projects more participatory. We will compare the effectiveness of communities that receive the added service to those that receive the basic program. This is the type of evaluation that tells you the impact of A versus B, C or D.

Alternatively, you might be interested in understanding which process works best, rather than which programs. You might want to know, say, what type of governance strategy or set of incentives and controls works best to ensure efficient implementation and good governance of the MVs. For example, I am working with NUSAF [a northern Uganda development agency] to evaluate 300 youth groups that are receiving funds for vocational training. One third of the groups will receive the regular NUSAF program, one third will receive additional funds to hire a community facilitator to help them monitor and manage their projects, and in one third of cases the district NUSAF office will receive an incentive to follow up with the groups and help them facilitate and manage the process. We will thus be able to assess the effectiveness of alternative schemes to improve the management and governance of the program. This is the type of evaluation that tells you how to do Program A better.

Third and finally, you may be interested in evaluating the basic idea underlying the MV projects: that the whole is greater than the sum of the parts. The MV project, as you know, is predicated on the idea that a single intervention cannot create development, and that it is the simultaneous elimination of health, education, agricultural, and other barriers that really matters: a ‘big push’. This is probably the most important hypothesis underlying the MV project, and also the most difficult to evaluate. What would be required is to have multiple comparison groups of villages: villages that receive the full MV package of interventions, as well as groups of villages that receive different combinations of the package, ranging from a few interventions to many. This evaluation strategy integrates all of the strategies mentioned above. It is very ambitious, but also very important. It could be done, assuming that the number of villages to work with is large enough and the political will is present.
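To make the ‘whole is greater than the sum of the parts’ test concrete, here is a minimal sketch in Python for the simplest possible case: two interventions, and four groups of villages that receive neither, one, the other, or both. The function name, the choice of health and education as the two interventions, and all of the outcome numbers are invented for illustration; a real analysis would also need standard errors and far more villages.

```python
from statistics import mean

def complementarity_gain(neither, health_only, education_only, both):
    """Effect of the combined package minus the sum of the separate effects.

    Each argument is a list of village-level outcomes for one group.
    A positive result is consistent with a 'big push': the combination
    does more than the two interventions would do separately.
    """
    base = mean(neither)
    effect_health = mean(health_only) - base
    effect_education = mean(education_only) - base
    effect_both = mean(both) - base
    return effect_both - (effect_health + effect_education)

# Hypothetical outcome scores, two villages per group (made-up numbers).
gain = complementarity_gain(
    neither=[50, 50],
    health_only=[54, 56],      # effect of health alone: 5
    education_only=[52, 54],   # effect of education alone: 3
    both=[64, 66],             # effect of the combination: 15
)
print(gain)  # 15 - (5 + 3) = 7
```

With more than two interventions the same logic applies, but the number of comparison groups grows quickly, which is why the number of villages matters so much for this design.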

These are the three major evaluation approaches I would suggest. In each of the three we define and survey the program groups as well as the comparison group—to use the scientific terms, the ‘treatment’ and ‘control’ groups. Surveys are performed before the program and afterwards on both groups.

Evaluation need not be expensive or very cumbersome. The critical thing is to get the evaluation strategy implemented properly—that is, to identify and maintain treatment and control groups. There are some other technical challenges in implementing such an evaluation, but we can discuss them in more detail once we settle on the approach of most interest.

I should mention that, in each of the examples that I gave above, the program and comparison villages (i.e. the treatment and control groups) were selected randomly. That is, we began with a list of eligible villages or groups, and we selected the beneficiaries via a lottery. This went over well with the communities (most were disappointed, but were fully informed and were happy with the transparency of the process). The advantage of the lottery approach is that the treatment and control villages are, on average, essentially identical before the program begins. An alternative approach would be to pre-identify MV sites and then use nearby communities for the comparison group. This approach is less than ideal, however, since we will never know whether any differences are due to the MV program or to pre-existing differences. Nevertheless, it is a workable option if the lottery approach is not feasible. I would strongly encourage the lottery approach, however.
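The lottery itself is simple to run. The following is a minimal sketch in Python, not a record of how any of the evaluations above were actually carried out: the village names, the seed, and the function name `run_lottery` are all hypothetical.

```python
import random

def run_lottery(eligible, n_treatment, seed):
    """Randomly split a list of eligible units into treatment and control.

    A fixed seed makes the draw reproducible, so that it can be
    audited afterwards.
    """
    rng = random.Random(seed)
    shuffled = list(eligible)   # copy, so the original list is untouched
    rng.shuffle(shuffled)
    treatment = sorted(shuffled[:n_treatment])
    control = sorted(shuffled[n_treatment:])
    return treatment, control

# Hypothetical list of 20 eligible villages, half of which get the program.
villages = [f"village_{i:02d}" for i in range(1, 21)]
treatment, control = run_lottery(villages, n_treatment=10, seed=2008)

print("Treatment:", treatment)
print("Control:  ", control)
```

In practice the draw is often done in public, which is part of what makes the process transparent to the communities involved.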

Let me stress that the usual approach to evaluation will NOT work in the MV case. What is too often done is to perform a baseline and a follow-up survey of the beneficiaries and compare the two. Any change is attributed to the program, even though the change represents the impact of the program plus all the other events (natural disasters, population growth, other programs, etc.) that occurred in the interim.

Moreover, the relevant comparison is not zero growth or change, but the change and development that would have occurred in the absence of the MV interventions. A non-random comparison group can give us a rough approximation of this change and help us to isolate the impact of the MV. The random assignment approach is far superior, however. And where we are evaluating the incremental impact of added interventions, or the effect of combining many interventions, random assignment is easy and uncontroversial to implement.
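The difference between the two approaches can be shown with a few lines of arithmetic. The sketch below, in Python, contrasts the simple before-and-after comparison with one that subtracts the change observed in comparison villages over the same period (often called a difference-in-differences estimate). All of the numbers and function names are invented for illustration.

```python
from statistics import mean

def before_after(baseline, followup):
    """Change in the average outcome between two survey rounds."""
    return mean(followup) - mean(baseline)

def diff_in_diff(t_base, t_follow, c_base, c_follow):
    """Change in treatment villages minus change in comparison villages."""
    return before_after(t_base, t_follow) - before_after(c_base, c_follow)

# Hypothetical village-level outcomes (e.g. an income index), made-up numbers.
t_base, t_follow = [100, 110, 90, 100], [130, 140, 120, 130]   # MV villages
c_base, c_follow = [100, 105, 95, 100], [120, 125, 115, 120]   # comparison

naive = before_after(t_base, t_follow)
adjusted = diff_in_diff(t_base, t_follow, c_base, c_follow)

print(naive)     # 30: the program plus everything else that happened
print(adjusted)  # 10: what remains after removing the comparison-group change
```

In this invented example, two thirds of the apparent improvement would have happened anyway. The adjusted figure is only as credible as the comparison group, which is the argument for choosing that group by lottery.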

Whatever choice is made, in my experience the principal challenge is often a political and strategic one, while the operational challenges are secondary. Thus far the MV organization in New York and the majority of implementing governments appear to have been uninterested in or unwilling to monitor and survey comparison villages (random or otherwise). In my experience the decision to evaluate and have control villages is uncontroversial at the village level. We have had no trouble in the 25 northern districts in which we have been conducting evaluations, for example. My understanding is that there is some ideological opposition to the idea of control groups or to evaluation of any sort. In some instances this is understandable. The MVs, however, are simply too important not to evaluate.

The MVs could present one of the greatest evaluation challenges to date. Two major statistical challenges will need to be overcome: the need for an adequate number of villages to be able to test alternative combinations of the package; and the possibility of spillover effects from treatment villages to control ones that lead us to underestimate the impact of the MVs.

The other problem with evaluating a single MV exercise is simple: is the impact generalizable? That is, if we find an impact in Uganda, can we generalize the result to any other region on earth? Is any impact an artifact of local conditions?

Finally, here is what seems to me to be the most interesting but most challenging proposition: it’s all about the governance. In some sense, an MV is simply a place where the government works: farmers get extension services, the health centers stay open, the teachers show up to work. There are not a lot of white people running around. There is simply an effective state with the resources and organization to carry out its job.

The MV dialogue is all about the resources. I am all about the governance. The Ugandan state works when it wants to.

What if the Ugandan MV works because the national and district governments are conscious that the whole world is looking, and so the best extension officers get assigned, or the best officials are appointed to oversee the effort? More effective interventions are probably the result.

What is the effect, as a head teacher or health center manager, of knowing that on any given day there is a chance that some visitor or delegation might pop up to check out the place? Probably you show up to work more often.

This is the most difficult aspect to evaluate, but potentially the most interesting. If true, the MVs may be inherently unscalable without a revolution in governance. If false, the MVs might be much more scalable and promising than expected. But how to evaluate such a thing? To be considered in future posts…
