That’s the title of a new paper by data guerrillas Dykstra, Dykstra, and Sandefur:
Much of the data underlying global poverty and inequality estimates is not in the public domain, but can be accessed in small pieces using the World Bank’s PovcalNet online tool. To overcome these limitations and reproduce this database in a format more useful to researchers, we ran approximately 23 million queries of the World Bank’s web site, accessing only information that was already in the public domain. This web scraping exercise produced 10,000 points on the cumulative distribution of income or consumption from each of 942 surveys spanning 127 countries over the period 1977 to 2012. This short note describes our methodology, briefly discusses some of the relevant intellectual property issues, and illustrates the kind of calculations that are facilitated by this data set, including growth incidence curves and poverty rates using alternative PPP indices.
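To make the idea concrete, here is a minimal sketch of how repeated queries could trace out a cumulative distribution. It assumes that each query returns the share of people living below a chosen poverty line, so that sweeping the line across a grid yields points on the distribution. Everything in it is hypothetical: `mock_headcount` is an offline stand-in built on a made-up log-normal distribution, not the World Bank's actual interface, and the grid of 2,000 lines is arbitrary. It is an illustration of the general approach, not the authors' code.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]  # (poverty line, share of people below it)


def mock_headcount(poverty_line: float) -> float:
    """Hypothetical stand-in for a single query.

    Returns the share of people living below `poverty_line`, using a
    log-normal income distribution with invented parameters.
    """
    if poverty_line <= 0:
        return 0.0
    mu, sigma = math.log(3.0), 0.8  # assumed values, not real survey data
    z = (math.log(poverty_line) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def scrape_cdf(query: Callable[[float], float], lines: List[float]) -> List[Point]:
    """Run one query per poverty line and collect the points on the CDF."""
    return [(line, query(line)) for line in lines]


def quantile(cdf: List[Point], p: float) -> float:
    """Find the income at cumulative share `p` by linear interpolation."""
    for (z0, f0), (z1, f1) in zip(cdf, cdf[1:]):
        if f0 <= p <= f1:
            if f1 == f0:
                return z0
            return z0 + (z1 - z0) * (p - f0) / (f1 - f0)
    raise ValueError("p lies outside the range covered by the queries")


# Sweep the poverty line from 0.05 to 100 in steps of 0.05 (arbitrary grid).
lines = [0.05 * i for i in range(1, 2001)]
cdf = scrape_cdf(mock_headcount, lines)
median = quantile(cdf, 0.5)
print(f"{len(cdf)} points scraped; median income is roughly {median:.2f}")
```

Once the distribution is stored as points like these, quantities such as poverty rates at alternative lines or the income at any percentile can be read off locally instead of through further queries.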
RT @cblatts: “We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/00jMmjnsci <== Now that’s a complicated ETL
@cblatts BTW “data guerrillas” is excellent.
“We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/YXOfQEhHBP via @feedly
RT @cblatts: “We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/HxzopCj3fs
RT @MJSwart: RT @cblatts: “We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/00jMmjnsci <== Now that’s a co…
“@MJSwart: RT @cblatts: “We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/Y7JGAS5Src cc @iOnline247
Data guerrillas: http://t.co/vfWUsTs03R
@cblatts they should be careful. Aaron Swartz was indicted for less
RT @cblatts: Data guerrillas: http://t.co/vfWUsTs03R
MT @cblatts: “We Just Ran Twenty-Three Million Queries of the @WorldBank’s Website” @JustinSandefur http://t.co/7G8K5m3kV2
Nice Sala-i-Martin reference MT @CGDev: MT @cblatts “We Just Ran 23 Million Queries of WB Website” @JustinSandefur http://t.co/RFLxY0bfet
@cblatts Love the recs: publish the underlying code, embrace open data formats, release sufficient data so others can recreate estimates
“We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/6WrEmRHKjl
“We Just Ran Twenty-Three Million Queries of the World Bank’s Website”: That’s the title of a new paper by dat… http://t.co/MCAKkQwqPI
RT @CGDev: MT @cblatts: “We Just Ran Twenty-Three Million Queries of the @WorldBank’s Website” @JustinSandefur http://t.co/7G8K5m3kV2
@cblatts Here’s a related thing: guerrilla polling: http://t.co/YUD1KjRToQ Obviously more dangerous though.
How to crash the World Bank website in the name of social science. http://t.co/JIxOgpoUjs
RT @cblatts: How to crash the World Bank website in the name of social science. http://t.co/JIxOgpoUjs
Another @cblatts piece – Data Guerrillas (cc @UKODITech @Floppy @statshero) http://t.co/31Tj3sKYUg
“We Just Ran Twenty-Three Million Queries of the World Bank’s Website” – Chris … http://t.co/BY9sIiacKR, see more http://t.co/M4stgKjNoH
“We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/DDBQxzen6W
“We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/4XPpFTSGJc via @jetpack
interesting →”@cblatts: How to crash the World Bank website in the name of social science. http://t.co/4rqOpvatk6“
“We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/G7fXT4Gj9l via @jetpack
“We Just Ran Twenty-Three Million Queries of the World Bank’s Website” http://t.co/SLtfaufBkl
DataGuerrillero The revolution [of data]’s not an apple that falls when it is ripe. You have to make it fall @cblatts http://t.co/BS4dvDpT1c
Great resource. RT @cblatts Data guerrillas: http://t.co/NTVj7GT30P