Bayesian reconstruction of disease outbreaks by combining epidemiologic and genomic data

PLoS Comput Biol. 2014 Jan;10(1):e1003457. doi: 10.1371/journal.pcbi.1003457. Epub 2014 Jan 23.


Recent years have seen progress in the development of statistically rigorous frameworks to infer outbreak transmission trees ("who infected whom") from epidemiological and genetic data. Making use of pathogen genome sequences in such analyses remains a challenge, however, with a variety of heuristic approaches having been explored to date. We introduce a statistical method exploiting both pathogen sequences and collection dates to unravel the dynamics of densely sampled outbreaks. Our approach identifies likely transmission events and infers dates of infections, unobserved cases and separate introductions of the disease. It also proves useful for inferring numbers of secondary infections and identifying heterogeneous infectivity and super-spreaders. After testing our approach using simulations, we illustrate the method with the analysis of the beginning of the 2003 Singaporean outbreak of Severe Acute Respiratory Syndrome (SARS), providing new insights into the early stage of this epidemic. Our approach is the first tool for disease outbreak reconstruction from genetic data widely available as free software, the R package outbreaker. It is applicable to various densely sampled epidemics, and improves on previous approaches by detecting unobserved and imported cases, as well as allowing multiple introductions of the pathogen. Because of its generality, we believe this method will become a tool of choice for the analysis of densely sampled disease outbreaks, and will form a rigorous framework for subsequent methodological developments.
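
To make the idea of combining collection dates with sequence data concrete, the following is a minimal toy sketch in Python. It is not the outbreaker model and does not use the outbreaker API. It assumes that collection dates stand in for infection dates, that the delay between infector and infectee follows a known discrete generation-time distribution, and that each site mutates independently with a fixed probability per transmission. The names hamming, log_pair_score, most_likely_infectors, gen_time_pmf and mu, as well as the example data, are hypothetical and chosen for illustration only. For each case, the sketch picks the single best-scoring candidate infector instead of sampling trees from a posterior, and it does not model unobserved cases.

```python
import math


def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def log_pair_score(delay, n_mut, seq_len, gen_time_pmf, mu):
    """Log-score for one candidate infector -> infectee pair.

    delay        : days between the two collection dates (assumed proxy
                   for the generation time)
    n_mut        : number of differing sites between the two sequences
    gen_time_pmf : gen_time_pmf[d] = assumed probability of a d-day delay
    mu           : assumed per-site mutation probability per transmission
    """
    # A candidate must be sampled strictly earlier and within the support
    # of the generation-time distribution.
    if delay <= 0 or delay >= len(gen_time_pmf) or gen_time_pmf[delay] <= 0:
        return -math.inf
    log_time = math.log(gen_time_pmf[delay])
    # Independent-sites mutation model (a simplifying assumption).
    log_genetic = n_mut * math.log(mu) + (seq_len - n_mut) * math.log(1 - mu)
    return log_time + log_genetic


def most_likely_infectors(cases, gen_time_pmf, mu):
    """Return {case_id: best infector id, or None if no candidate fits}."""
    tree = {}
    for cid, (date, seq) in cases.items():
        best, best_score = None, -math.inf
        for aid, (adate, aseq) in cases.items():
            if aid == cid:
                continue
            score = log_pair_score(
                date - adate, hamming(seq, aseq), len(seq), gen_time_pmf, mu
            )
            if score > best_score:
                best, best_score = aid, score
        tree[cid] = best
    return tree


# Hypothetical example data: case id -> (collection day, sequence).
cases = {
    "A": (0, "AAAAAAAAAA"),
    "B": (2, "AAAAAAAAAT"),
    "C": (4, "AAAAAAAATT"),
    "D": (3, "AAAAAAAAAA"),
}
gen_time_pmf = [0.0, 0.2, 0.4, 0.3, 0.1]  # index = delay in days (assumed)
tree = most_likely_infectors(cases, gen_time_pmf, mu=0.05)
print(tree)
```

In this toy setting, a case with no admissible candidate is assigned None, which loosely corresponds to an index or imported case. A full Bayesian treatment would place priors on the unknown quantities and sample transmission trees, rather than take a per-case maximum.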

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Communicable Disease Control
  • Computer Simulation
  • Disease Outbreaks*
  • Entropy
  • Epidemics
  • Genome, Viral
  • Genomics
  • Humans
  • Mutation
  • Probability
  • Programming Languages
  • SARS Virus / genetics*
  • Severe Acute Respiratory Syndrome / epidemiology*
  • Severe Acute Respiratory Syndrome / virology*
  • Singapore
  • Software

Grant support

We acknowledge research funding from the NIGMS MIDAS initiative, the Bill & Melinda Gates Foundation, the European Union FP7 EMPERIE and PREDEMICS projects, and the Medical Research Council. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.