The power of web data: Assessing the impact of health interventions

Friday 31st July 2015

i-sense, UCL, Public Health England (PHE) and Microsoft researchers have found evidence for the effectiveness of an England-wide flu vaccination programme by analysing tweets and Bing search queries.

The study, led by UCL and i-sense researcher Vasileios Lampos, demonstrates how data generated by Internet users can be successfully used to assess the impact of health interventions, in this case the pilot Live Attenuated Influenza Vaccine (LAIV) campaign.

The UK, in an effort to reduce the spread of influenza in the general population, has introduced nation-wide vaccination programmes. Recognising that children are key factors in the transmission of the influenza virus, the LAIV programme was launched in several areas in England during the 2013/14 influenza season, with vaccines offered to school children aged from 4 to 11 years.

Using LAIV as a case study, researchers extended previous modelling approaches for analysing influenza rates from Internet content and proposed a statistical framework for assessing the impact of a health intervention.

The study involved processing millions of Twitter postings and Bing search queries that had been generated in the target vaccinated locations, as well as a broader set of control locations across England. The researchers began by assessing current data modelling techniques that could be used to predict the rates of influenza-like illness (ILI) from search queries and social media posts. They were able to significantly improve these techniques, making more accurate ILI estimates. Then they performed a statistical analysis on this data to evaluate the impact of the vaccination programme on the rates of flu in the affected areas.
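As a rough illustration of how vaccinated and control locations might be compared, the sketch below fits a straight line relating control-area ILI estimates to vaccinated-area ILI estimates before the intervention, projects what the vaccinated areas might have looked like without it, and reports the difference. This is one simple way to frame such a comparison under stated assumptions, not necessarily the statistical framework used in the study; the function names and all numbers are hypothetical and made up for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = a * x + b."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    return a, b


def estimate_impact(control_pre, target_pre, control_post, target_post):
    """Return (delta, theta): absolute and relative difference between the
    observed mean ILI rate in the vaccinated areas and the mean rate
    projected from the control areas."""
    # Learn how the two sets of areas related before the intervention.
    a, b = fit_line(control_pre, target_pre)
    # Project the vaccinated areas' rates as if nothing had changed.
    projected = [a * c + b for c in control_post]
    delta = mean(target_post) - mean(projected)
    theta = delta / mean(projected)
    return delta, theta


# Made-up ILI estimates (e.g. per 100,000 people), for illustration only.
control_pre = [10.0, 20.0, 30.0, 40.0]
target_pre = [12.0, 22.0, 32.0, 42.0]
control_post = [20.0, 30.0, 40.0]
target_post = [18.0, 26.0, 34.0]

delta, theta = estimate_impact(control_pre, target_pre, control_post, target_post)
print(f"absolute difference: {delta:.2f}, relative difference: {theta:.1%}")
```

A negative relative difference would suggest lower ILI rates in the vaccinated areas than the control areas would predict. A real analysis would also need to quantify the uncertainty of that estimate.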

Using web-based methods meant the researchers could assess a different and much broader part of the population than traditional health surveillance methods, such as monitoring doctor visits or hospitalisations, can reach. Online data provide access to additional cases of infection in people who may not use the healthcare system.

The findings show that there was a reduction in the rates of influenza-like illnesses in the vaccinated areas and therefore the LAIV campaign may have had a positive health impact. This supports the impact estimation findings provided by traditional health surveillance data, such as Public Health England’s.

This research provides further evidence for the validity of statistical methods that use user-generated online data to draw various types of inference, in this case estimating the impact of a national health intervention. It also demonstrates the unique and valuable insight that Internet data can give into health at a population level.

This research, funded by i-sense, was published in Data Mining and Knowledge Discovery and presented at the journal track of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2015 (ECML PKDD) in September 2015.

View the full slides here.

Journal link: Lampos, V., Yom-Tov, E., Pebody, R. & Cox, I.J. 'Assessing the impact of a health intervention via user-generated Internet content', Data Mining and Knowledge Discovery (2015); DOI: 10.1007/s10618-015-0427-9