Web Data: The Power of Social Media and Web Searches to Identify Disease Outbreaks

Can web information help to identify disease outbreaks earlier than ever before?

The number of people using the web and social media has risen rapidly across both the developed and developing world. Each day millions of people self-report symptoms such as “fever”, “cough” and “sore throat” on social media such as Twitter.

Increasingly, people are using the internet to search for information about their own health. In the UK it is estimated that more than 80% of all internet users search for health information. The number of tweets and searches relating to influenza-like illness (ILI) increases during flu season, and these anonymised data can help to track outbreaks across populations almost instantaneously, with geographically linked information. Yahoo and Google have shown that searches can pick up outbreaks up to two weeks earlier than traditional disease surveillance.
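As a rough illustration of how self-reported symptoms might be tallied over time, the Python sketch below counts posts that mention an ILI-related term in each week. It is a minimal sketch under stated assumptions, not the method used by any of the groups mentioned here: the keyword list, the function name weekly_symptom_counts and the sample posts are all hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical symptom keywords; a real system would need a curated, validated list.
ILI_TERMS = ("fever", "cough", "sore throat")


def weekly_symptom_counts(posts):
    """Count posts mentioning any ILI term, grouped by ISO (year, week)."""
    counts = Counter()
    for posted_on, text in posts:
        lowered = text.lower()
        # Simple substring matching; it cannot tell illness from figures of speech.
        if any(term in lowered for term in ILI_TERMS):
            iso = posted_on.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return counts


# Made-up example posts, for illustration only.
sample_posts = [
    (date(2014, 1, 6), "Stuck in bed with a fever and a cough"),
    (date(2014, 1, 7), "Great match last night"),
    (date(2014, 1, 8), "Sore throat again, ugh"),
    (date(2014, 1, 14), "This cough will not go away"),
]

weekly = weekly_symptom_counts(sample_posts)
for (year, week), n in sorted(weekly.items()):
    print(f"{year} week {week}: {n} symptom posts")
```

Keyword matching of this kind is deliberately naive. It counts any mention of a term, whether or not the author is actually ill, which is one reason such counts are a signal to investigate rather than a diagnosis.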

i-sense Research Associate Vasileios Lampos was one of the first people to report the use of Twitter data for flu epidemiology. Tracking web data also enables a larger proportion of the population to be assessed compared with traditional health surveillance methods.

However, symptoms are not a diagnosis, and different diseases can share common symptoms. Accurate diagnosis of the underlying infectious agent therefore remains the cornerstone of early warning systems, since this informs the correct interventions.

While a number of leading researchers are developing approaches based on Twitter, search engines or mobile phone-connected tests, our unique approach will integrate all of these data sources to build advanced sensing systems. These systems will detect infectious diseases with specificity and in real time, enabling rapid and effective public health interventions.

We aim to overcome the challenges of this task by improving the accuracy of big data methods. For instance, we are developing tools to address some of the problems that Google Flu Trends has recently encountered, including surges in media interest, which distort the numbers of self-reported symptoms.

