Can we use data and machine learning to predict local violence in fragile states? As it turns out, yes.

October 2, 2014

After years of working on program evaluation and related things, it is with great joy that I toss causation out the window and learn to data mine.

A few years ago, a foundation said to me, “Hey, all that data you’re collecting to study property disputes and other violence in Liberia–could you use it to test early warning systems for riots and major crimes?” My reaction: “That sounds crazy. As if that’s possible.” Their response: “We will fund your survey if you try.” My reply: “Did I say crazy? I meant that sounds like a great idea.”

After six years of data collection, Rob Blair and Alex Hartman and I finally have a paper:

We use forecasting models and new data from 242 Liberian communities to show that it is possible to predict outbreaks of local violence with high sensitivity and moderate accuracy, even with limited data.

We train our models to predict communal and criminal violence in 2010 using risk factors measured in 2008. We compare predictions to actual violence in 2012 and find that up to 88% of all violence is correctly predicted. True positives come at the cost of many false positives, giving overall accuracy between 33% and 50%.

From a policy perspective, states, international organizations, and peacekeepers could use such predictions to better prevent and respond to violence. The models also generate new stylized facts for theory to explain.

In this instance, the strongest predictors of more violence are social (mainly ethnic) cleavages, and minority group power-sharing.
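
To see how high sensitivity can sit alongside modest accuracy, here is a back-of-the-envelope sketch. The counts are invented for illustration (only the 242 total comes from the abstract) and are not the paper's confusion matrix. The `rates` helper is likewise hypothetical.

```python
def rates(tp, fn, fp, tn):
    """Standard summary rates from the four cells of a confusion matrix."""
    total = tp + fn + fp + tn
    return {
        # Share of actual violence that was flagged in advance.
        "sensitivity": tp / (tp + fn),
        # Share of flagged communities that actually saw violence.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of all communities classified correctly.
        "accuracy": (tp + tn) / total,
    }

# HYPOTHETICAL counts: 242 communities, 30 with violence.
# The model flags 150 communities and catches 26 of the 30 violent ones.
flag_many = rates(tp=26, fn=4, fp=124, tn=88)

# A baseline that never predicts violence anywhere.
never_flag = rates(tp=0, fn=30, fp=0, tn=212)

for name, r in [("flag_many", flag_many), ("never_flag", never_flag)]:
    print(name, {k: round(v, 2) for k, v in r.items()})
```

With these made-up numbers, the model that flags many communities catches about 87% of violence at about 47% accuracy. The never-flag baseline scores about 88% accuracy while catching nothing, which is why accuracy alone says little when violence is rare.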

This is not precisely “big data” in that it’s a small number of villages and three years of events. But it’s “big” in the sense of having lots and lots of detailed information about the villages themselves, which is rare. We think of this as a pilot, or proof of concept for the approach, and plan to test it next on much bigger data from other countries.
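
Having several years of data is what makes a forward-looking test possible: tune a model on one pair of waves, then score it on the next pair without retuning. The sketch below is a toy version of that idea on synthetic data. The additive risk score, the five binary risk factors, and the 85% sensitivity target are all assumptions standing in for the actual models and variables, which are not reproduced here.

```python
import random

def make_wave(rng, n=242):
    """Synthetic communities as (risk_factors, violence_in_next_wave). Made-up data."""
    rows = []
    for _ in range(n):
        factors = [rng.random() < 0.3 for _ in range(5)]  # five hypothetical risk factors
        p = 0.03 + 0.08 * sum(factors)                    # more factors, more risk
        rows.append((factors, rng.random() < p))
    return rows

def score(factors):
    """Toy risk score: the number of risk factors present."""
    return sum(factors)

def pick_threshold(train, target_sensitivity=0.85):
    """Highest cutoff that still flags the target share of violent training cases."""
    violent_scores = [score(f) for f, y in train if y]
    if not violent_scores:
        return 0
    for cutoff in range(5, -1, -1):
        caught = sum(s >= cutoff for s in violent_scores)
        if caught / len(violent_scores) >= target_sensitivity:
            return cutoff
    return 0

def evaluate(rows, cutoff):
    """Return (sensitivity, accuracy) when flagging every community with score >= cutoff."""
    tp = sum(1 for f, y in rows if y and score(f) >= cutoff)
    fn = sum(1 for f, y in rows if y and score(f) < cutoff)
    tn = sum(1 for f, y in rows if not y and score(f) < cutoff)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(rows)
    return sensitivity, accuracy

rng = random.Random(0)
train = make_wave(rng)  # stands in for 2008 risk factors -> 2010 violence
test = make_wave(rng)   # stands in for 2010 risk factors -> 2012 violence

cutoff = pick_threshold(train)     # tuned on the earlier pair only
sens, acc = evaluate(test, cutoff)  # scored on the later pair
print(f"cutoff={cutoff} sensitivity={sens:.2f} accuracy={acc:.2f}")
```

The point of the design is that nothing from the later wave leaks into the tuning step, so the reported sensitivity and accuracy are genuine out-of-sample numbers.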

The most interesting finding, to me, was how power-sharing at the local level was associated with more violence. There are actually a number of papers looking at national power-sharing right now that find the same thing. And yet the common political response to a crisis nowadays is to push for power-sharing. Worth investigating.

I would have liked to name this paper “I just ran 32 million regressions,” but besides other drawbacks, the more honest title would be “My RA just ran 32 million regressions,” which is slightly less compelling.

58 Responses

Chris Blattman says:

October 15, 2014 at 1:04 pm

@Luke and @RDub2:

Thanks. @RDub2: We’re somewhat novices. We know the limitations of the statistics we use but aren’t sure what else we can add to fill out the picture. Suggestions?

@Luke:
1) You are right and we clarified what we meant in the new version.
2) The robustness tables at the back show both. We’ve done more robustness. Mostly the results are stable, though least of all with RF if I recall.
3) Would like more detail. I’m not familiar enough with NN to think about how this would be a worry with k-fold cross validation and the forward looking forecast.
4) No good reason other than the simplicity of lasso appealed to us and we had to limit our models somehow.
RDub2 says:

October 13, 2014 at 7:41 pm

As I’m sure you are aware, true positives and negatives are not always the best way to measure a model’s accuracy, and they are highly dependent on category size. Thanks for posting.
Luke says:

October 13, 2014 at 5:02 pm

Interesting article. Couple of questions:
1) Why say that the models apart from the neural network are not interactive or non-linear? Random forest is highly interactive and non-linear.
2) Did you experiment with the number of trees in the random forest or with depth?
3) You might want to also mention that neural networks are more prone to overfitting…
4) Why lasso instead of elastic-net?
Sebastian says:

October 12, 2014 at 12:00 pm

Dear Chris, I was wondering which language(s) and package(s) you used for analyzing your data?
Andy says:

October 5, 2014 at 10:40 am

Is this a secret battle between you and Sala-i-Martin at Columbia?

http://www.jstor.org/discover/10.2307/2950909?uid=3739832&uid=2&uid=4&uid=3739256&sid=21104277908541

http://www.nber.org/papers/w6252
DevIntern says:

October 3, 2014 at 2:58 pm

RT @BottmUpThinking: Another tech leapfrog moment for #globaldev? Is @cblatts helping make #minorityreport a reality in Liberia? http://t.c…
Andy says:

October 3, 2014 at 1:07 pm

Just in time. Now we can turn to the deleterious impact of Ebola on the economic fabric of the country.
hogleca says:

October 3, 2014 at 12:46 pm

RT @jeneambrose: In which @cblatts runs 32 million regressions & determines it’s possible to predict outbreaks of local violence. http://t.…
roryeakin says:

October 3, 2014 at 12:03 am

RT @nancymbirdsall: Yup. Consider new “national unity” govt in Afghanistan RT @cblatts: Machine learning predicts violence in Liberia http:…
Dan_E_Solo says:

October 2, 2014 at 5:32 pm

@cblatts This was great–it’s rare to see data on a “peaceful” situation. I’d love to see a mixed-methods study on the power-sharing bit.

Chris Blattman
