Web Data: The Power of Social Media and Web Searches to Identify Disease Outbreaks

Can web information help to identify disease outbreaks earlier than ever before?

There has been a rapid rise in the number of people using the web and social media across both the developed and developing world. Each day, millions of people self-report symptoms such as “fever”, “cough” and “sore throat” online via social media such as Twitter.

Increasingly, people are using the internet to search for information about their own health. In the UK it is estimated that more than 80% of all internet users search for information about health. The number of tweets and searches for influenza-like illness (ILI) increases during flu season, and these anonymised data can help to track outbreaks across populations almost instantaneously and with geographically linked information. Yahoo and Google have shown that searches can pick up outbreaks up to two weeks earlier than traditional disease surveillance.

i-sense Research Associate Vasileios Lampos was one of the first people to report the use of Twitter data for flu epidemiology. Tracking web data also enables a larger proportion of the population to be assessed compared with traditional health surveillance methods.

However, symptoms are not a diagnosis, and different diseases can share common symptoms. Therefore, accurate diagnosis of the underlying infectious agent remains the cornerstone of early warning systems, since this informs the correct interventions.

While a number of leading researchers are developing tools based on Twitter, search engines or mobile phone-connected tests, our unique approach will integrate all of these data sources to build advanced sensing systems. These systems will detect infectious diseases with specificity and in real time, enabling rapid and effective public health interventions.

We aim to overcome the challenges of this task by improving the accuracy of big data methods. For instance, we aim to develop tools for some of the problems that Google Flu Trends has recently encountered, including surges in media interest, which distort the numbers of self-reported symptoms.

Related papers

Lampos, V., Zou, B., Cox, I.J. 'Enhancing Feature Selection Using Word Embeddings: The Case of Flu Surveillance' WWW '17 Proceedings of the 26th International Conference on World Wide Web (2017); DOI: 10.1145/3038912.3052622

Aletras, N., Tsarapatsanis, D., Preotiuc-Pietro, D., Lampos, V. 'Predicting judicial decisions of the European Court of Human Rights: a Natural Language Processing perspective.' PeerJ Computer Science 2:e93 (2016); DOI: 10.7717/peerj-cs.93

Lampos, V., Aletras, N., Geyti, J.K., Zou, B., Cox, I.J. 'Inferring the Socioeconomic Status of Social Media Users Based on Behaviour and Language.' Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol 9626. Springer, Cham (2016); DOI: 10.1007/978-3-319-30671-1_54

Zou, B., Lampos, V., Gorton, R., Cox, I.J. 'On Infectious Intestinal Disease Surveillance using Social Media Content' Proceedings of the 6th International Conference on Digital Health (2016).

Related links

Google Flu Trends revisited: Improving influenza modelling from search query logs

Assessing the impact of health interventions using Internet data: An i-sense, UCL, PHE and Microsoft study

i-sense and Microsoft study: Detecting Disease Outbreaks in Mass Gatherings Using Internet Data

Flagship 2: Early Warning Sensing Systems for Influenza

Google Flu Trends

Larry Brilliant et al: Detecting influenza epidemics using search engine query data (Nature, 2009)