Newer methods rely on search engines to index the web, then attempt to estimate coverage of different languages on the basis of the comparative frequency of words in different languages.
…In 1996, his research estimated that 80 percent of the content online was in English. That percentage fell steadily through successive experiments until 2005, when he estimated that 45 percent of online content was in English.
Ethan Zuckerman, writing in Quartz on research by Álvaro Blanco.
@qz is consistently one of the best Twitter feeds I follow.