Quantitative versus qualitative measurement, the contest - Chris Blattman

Quantitative versus qualitative measurement, the contest

If you run a survey of drug use, prostitution, domestic violence, rioting, or crime, who would believe this self-reported data? No one.

If you work in one of the handful of countries with reliable, available data, then you might be able to use police or hospital records. I’m looking at you, American scholars, you lucky bastards.

If you’re working in the largely evidence-free zone that is poor countries–especially fragile states–then you have to get creative. Some of my favorites: Alex Scacco randomized whether she interviewed potential Nigerian rioters behind a screen or not. After running an ethnic reconciliation program, Betsy Paluck gave out community gifts in Rwanda and looked at how they were shared out.

In one of my more self-flagellating moments, some colleagues and I decided to start a study of crime and violence-reduction among street youth in Liberia–mostly men who make their living from petty crime and drug dealing (among other things) and lead very risky lives.

The effects of the programs–behavior change therapy and cash transfers–I’ll discuss another time. In brief, a cheap and short program of cognitive therapy seems to have dropped crime, violence, and drug use by huge amounts. And the effects persisted at least a year.

There were two possibilities. One, we’d stumbled upon a miracle cure. Two, they were telling us what we wanted to hear.

What to do? Well, we said, let’s try to measure the measurement error.

I prefer to mix my quant work with serious qual work. Basically, we had two or three truly gifted Liberian research assistants who collected qualitative data full time. They produced thousands of pages of conversation transcripts that I sometimes wonder how we will ever read. We’d visit dozens of the men in the study over and over again across a year to see how their lives changed over time.

These qualitative researchers were already embedded in the communities. So, we thought, why not have them hang out with the survey respondents for four days around the time of their survey? We’d get a general sense of their lives (with full disclosure and consent), but through conversation and observation focus on figuring out the same six things about all of them–some seemingly sensitive behaviors (drug use, petty theft, gambling, and homelessness) plus for balance the use of a few common luxuries, video clubs and paid phone charging services.

The qualitative researchers had no idea what the guys had said on the survey, but just coded their own assessment, usually based on a frank admission of yes or no. Across 4000 surveys, we tried this a random 300 times.
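For anyone who wants to copy the design, here is a minimal sketch of drawing a random validation subsample. The post only says 300 of roughly 4000 surveys were picked at random, so the function name, the seed, and the use of simple random sampling without replacement are all assumptions for illustration, not a description of what we actually ran.

```python
import random


def draw_validation_sample(n_surveys, n_validate, seed):
    """Pick survey indices for qualitative validation, without replacement.

    Hypothetical helper: simple random sampling is an assumption here.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return sorted(rng.sample(range(n_surveys), n_validate))


# 300 validations out of 4000 surveys, as in the post; the seed is arbitrary.
sample_ids = draw_validation_sample(n_surveys=4000, n_validate=300, seed=2014)
print(len(sample_ids), "surveys selected for validation")
```

Fixing the seed matters mostly for auditability: anyone can re-run the draw and confirm nobody hand-picked the easy respondents.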

So what happens when you compare a survey question, “did you smoke marijuana in the last two weeks?” to four days of “deep hanging out”? In our case, you basically get the same answer.

The survey and qualitative measures were the same about 75% of the time. When they differed, that difference wasn’t systematic at all. We’d get the same average levels of drug use or stealing from either method, at least for the so-called sensitive behaviors. As it turns out, these particular guys had no reservations at all about talking about crime or drugs. It was such an everyday part of their lives. So, as best we can tell, the survey reports were reasonably right, and the falls in “bad” behaviors real.
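To make "same about 75% of the time, but not systematically different" concrete, here is a small sketch with made-up yes/no answers. The data, the `Comparison` class, and `compare_measures` are hypothetical; the paper's actual estimation is more involved than this.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    agreement_rate: float  # share of cases where both methods give the same answer
    survey_mean: float     # prevalence according to the survey
    qual_mean: float       # prevalence according to the qualitative coding
    net_bias: float        # survey_mean - qual_mean; near zero means no systematic gap


def compare_measures(survey, qual):
    """Compare paired 0/1 answers from the survey and from qualitative coding."""
    if not survey or len(survey) != len(qual):
        raise ValueError("need two non-empty lists of equal length")
    n = len(survey)
    agree = sum(1 for s, q in zip(survey, qual) if s == q)
    survey_mean = sum(survey) / n
    qual_mean = sum(qual) / n
    return Comparison(agree / n, survey_mean, qual_mean, survey_mean - qual_mean)


# Made-up answers for eight respondents (1 = reported the behavior, 0 = did not).
survey = [1, 0, 1, 1, 0, 0, 1, 0]
qual = [1, 0, 0, 1, 1, 0, 1, 0]
result = compare_measures(survey, qual)
print(result)
```

In this toy data the two methods disagree on two of eight people, but one disagreement goes each way, so both methods give the same average. That is the pattern that matters: noise in individual answers is tolerable as long as it does not push the averages in one direction.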

What’s interesting is that this didn’t apply to the so-called non-sensitive luxury goods. These the men underreported a little, and almost entirely in the control group. There are a few explanations, but one is that the control group wanted to appear poorer and more deserving of a future program.
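The worry with the luxury goods is misreporting that differs by study arm, since that can bias a treatment effect even when the overall averages look fine. A rough sketch of the check, again with invented records and a hypothetical `net_bias_by_arm` helper:

```python
from collections import defaultdict


def net_bias_by_arm(records):
    """Average (survey - qualitative) gap per arm.

    records: iterable of (arm, survey_answer, qual_answer) with 0/1 answers.
    A negative value means the survey understates the behavior in that arm.
    """
    totals = defaultdict(lambda: [0, 0])  # arm -> [sum of gaps, count]
    for arm, survey_answer, qual_answer in records:
        totals[arm][0] += survey_answer - qual_answer
        totals[arm][1] += 1
    return {arm: gap / count for arm, (gap, count) in totals.items()}


# Invented data: control respondents sometimes deny using a video club
# that the qualitative researcher saw them use.
records = [
    ("control", 0, 1), ("control", 1, 1), ("control", 0, 0), ("control", 0, 1),
    ("treatment", 1, 1), ("treatment", 0, 0), ("treatment", 1, 0), ("treatment", 0, 1),
]
bias = net_bias_by_arm(records)
print(bias)
```

Here the treatment arm's errors cancel out while the control arm underreports, which is the shape of the problem described above (the magnitudes are made up).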

The wonkier among you might be wondering: why not use list experiments? The short answer: I don’t 100% believe them, and if you tried testing them on illiterate street youth with short attention spans you’d give up too.

Actually, our deep hanging out wasn’t as hard or as expensive to do as you’d think. Tracking and surveying each person each time cost about $75 on the margin (they were hard to keep track of). The qualitative validation had roughly the same per person variable cost.
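A back-of-envelope version of that cost comparison, using the figures in this post. Treating the validation cost as exactly $75 is an assumption (the post only says "roughly the same"), and fixed costs such as training and salaries are ignored.

```python
COST_PER_SURVEY_USD = 75      # marginal cost per survey, from the post
N_SURVEYS = 4000              # approximate number of surveys, from the post
N_VALIDATIONS = 300           # validated subsample, from the post
COST_PER_VALIDATION_USD = 75  # assumption: "roughly the same" taken as equal

survey_total = COST_PER_SURVEY_USD * N_SURVEYS
validation_total = COST_PER_VALIDATION_USD * N_VALIDATIONS
validation_share = validation_total / survey_total

print(f"Survey variable cost:     ${survey_total:,}")
print(f"Validation variable cost: ${validation_total:,}")
print(f"Validation as a share of survey cost: {validation_share:.1%}")
```

Under these assumptions, validating a random subsample adds well under a tenth to the variable cost of surveying.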

We hope more people will try it out. That’s why we not only wrote up the results as a paper, but also have an appendix explaining how we did it in detail. Plus algebra!

The whole exercise gave us a lot of confidence in our results. Frankly I don’t think any respectable journal would publish the main experiment without this confidence. If you think you’re going to try it, email and we’ll fill you in on more lessons learned and experiences.

89 Responses

  1. the longitudinal aspect sounds like it adds great depth too – qual. revisits over a year seem to me as important as 4 days ‘deep hanging out’. my first critical thought re the latter was – maybe you get some people who’ll hide their behaviour while being shadowed. but you seem pretty confident that in general that’s not an issue – the participants don’t see their behaviour as sensitive as we might see it. i guess you already have context over a year (good luck reading those transcripts!), and also it’s unlikely that enough of the 300 could do this?

    doesn’t get round BottumUp’s point though!

  2. Excellent and good to know. I’ve had experience in other (very different) situations where “ecological” questions (as per Peter Dorman above) were critical, so I guess brazenness could be something of a factor in how truthful self-reporting can be.

    There is, though, one other explanation, which may well have occurred to you. That the act of observing itself drives changes, quantum-mechanics-style. If the subjects perceive themselves as somehow special for being selected for this study, might that also affect their changes? I.e. they may be rather more habituated to drugs and crime than being studied, and responding significantly to that. Obviously the control group ought to control for that, but with ethical full disclosure, they may also have understood their different role in the study, and thus responded differently to that stimulus. Obviously very tricky to get out of such a dilemma!

  3. Great work! I love it when hard work yields something simpler, tangible, and inspiring. At the other end of complexity lies simplicity (or so I’ve been told, often by people who themselves have yet to traverse the roads they’re quick to vaunt).

    I’d love it if you’d paste some of your transcripts into the tool and see what, if anything, it reveals to you.

  4. This is so interesting. I tried to do research on a change in drug law in Brazil, and data can be incredibly misleading… A qualitative approach would help interpret the results, and although I have spoken to a few people in the field, it is not organized enough to publish. I feel that after working with data, pure data is not enough, even if it is an RCT – my personal opinion, though ;)

  5. I think there’s a tendency to overestimate how skittish or dishonest people will be about answering sensitive questions. Ten years ago I did a study of the pay and productivity of child labor in several countries, and, going in, I was terrified that survey methods would simply fail. As a backup, I included ecological questions to employers — “how much do you think other employers pay children to do x?” etc. To my surprise, I encountered no reluctance to answer sensitive questions in any of the survey samples, and the correlation between individual and ecological questions was very high.

    I think researchers should take a chance on responsiveness, at least at the start.
