Detecting influenza epidemics using search engine query data

Nature. 2009 Feb 19;457(7232):1012-4. doi: 10.1038/nature07634.

Abstract

Seasonal influenza epidemics are a major public health concern, causing tens of millions of respiratory illnesses and 250,000 to 500,000 deaths worldwide each year. In addition to seasonal influenza, a new strain of influenza virus against which no previous immunity exists and that demonstrates human-to-human transmission could result in a pandemic with millions of fatalities. Early detection of disease activity, when followed by a rapid response, can reduce the impact of both seasonal and pandemic influenza. One way to improve early detection is to monitor health-seeking behaviour in the form of queries to online search engines, which are submitted by millions of users around the world each day. Here we present a method of analysing large numbers of Google search queries to track influenza-like illness in a population. Because the relative frequency of certain queries is highly correlated with the percentage of physician visits in which a patient presents with influenza-like symptoms, we can accurately estimate the current level of weekly influenza activity in each region of the United States, with a reporting lag of about one day. This approach may make it possible to use search queries to detect influenza epidemics in areas with a large population of web search users.
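The relationship described above, where the relative frequency of influenza-related queries tracks the percentage of physician visits for influenza-like illness (ILI), can be illustrated with a small linear fit. The following is a minimal sketch, not the authors' implementation: the abstract does not state the model form, so fitting a straight line on the log-odds scale is an assumption here. The weekly numbers are made up for illustration, and the names `fit_line`, `pearson`, and `estimate_ili` are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a real number back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Made-up weekly data (NOT from the paper):
# fraction of all search queries that are influenza-related ...
query_fraction = [0.0010, 0.0015, 0.0022, 0.0031, 0.0040, 0.0028]
# ... and fraction of physician visits that are ILI-related in the same weeks.
ili_visit_fraction = [0.012, 0.017, 0.024, 0.033, 0.041, 0.030]

# Assumption: fit on the log-odds scale so estimates stay between 0 and 1.
x = [logit(q) for q in query_fraction]
y = [logit(p) for p in ili_visit_fraction]
intercept, slope = fit_line(x, y)
correlation = pearson(x, y)


def estimate_ili(q):
    """Estimate the ILI visit fraction for a week with query fraction q."""
    return inv_logit(intercept + slope * logit(q))


if __name__ == "__main__":
    print(f"correlation on log-odds scale: {correlation:.3f}")
    print(f"estimated ILI visit fraction at q=0.0025: {estimate_ili(0.0025):.4f}")
```

In this sketch, a new week's estimate needs only that week's query fraction, which is what would allow a short reporting lag once the coefficients have been fitted against historical surveillance data.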

MeSH terms

  • Centers for Disease Control and Prevention, U.S.
  • Databases, Factual
  • Health Behavior*
  • Health Education / statistics & numerical data*
  • Humans
  • Influenza, Human / diagnosis
  • Influenza, Human / epidemiology*
  • Influenza, Human / transmission
  • Influenza, Human / virology
  • Internationality
  • Internet / statistics & numerical data*
  • Linear Models
  • Office Visits / statistics & numerical data
  • Population Surveillance / methods*
  • Reproducibility of Results
  • Seasons
  • Time Factors
  • United States
  • User-Computer Interface*