Bayesian inference of transmission chains using timing of symptoms, pathogen genomes and contact data

PLoS Comput Biol. 2019 Mar 29;15(3):e1006930. doi: 10.1371/journal.pcbi.1006930. eCollection 2019 Mar.

Abstract

There exists significant interest in developing statistical and computational tools for inferring 'who infected whom' in an infectious disease outbreak from densely sampled case data, with most recent studies focusing on the analysis of whole genome sequence data. However, genomic data can be poorly informative of transmission events if mutations accumulate too slowly to resolve individual transmission pairs or if there exist multiple pathogen lineages within-host, and there has been little focus on incorporating other types of outbreak data. We present here a methodology that uses contact data for the inference of transmission trees in a statistically rigorous manner, alongside genomic data and temporal data. Contact data is frequently collected in outbreaks of pathogens spread by close contact, including Ebola virus (EBOV), severe acute respiratory syndrome coronavirus (SARS-CoV) and Mycobacterium tuberculosis (TB), and routinely used to reconstruct transmission chains. As an improvement over previous, ad-hoc approaches, we developed a probabilistic model that relates a set of contact data to an underlying transmission tree and integrated this in the outbreaker2 inference framework. By analyzing simulated outbreaks under various contact tracing scenarios, we demonstrate that contact data significantly improves our ability to reconstruct transmission trees, even under realistic limitations on the coverage of the contact tracing effort and the amount of non-infectious mixing between cases. Indeed, contact data is equally or more informative than fully sampled whole genome sequence data in certain scenarios. We then use our method to analyze the early stages of the 2003 SARS outbreak in Singapore and describe the range of transmission scenarios consistent with contact data and genetic sequence in a probabilistic manner for the first time. This simple yet flexible model can easily be incorporated into existing tools for outbreak reconstruction and should permit a better integration of genomic and epidemiological data for inferring transmission chains.
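
To illustrate how a probabilistic model can relate reported contacts to a candidate transmission tree, the following is a minimal Python sketch, not the outbreaker2 implementation. It assumes two hypothetical parameters suggested by the limitations named in the abstract: `eps`, the probability that a contact between a transmission pair is reported (tracing coverage), and `lam`, the probability that two cases not linked by transmission nonetheless had contact (non-infectious mixing). The function name, the data structures, and the choice of `eps * lam` as the reporting probability for unlinked pairs are assumptions made for illustration only.

```python
import math
from itertools import combinations


def contact_log_likelihood(ancestors, contacts, eps, lam):
    """Log-likelihood of reported contacts given a transmission tree.

    ancestors: dict mapping each case to its infector (None for an index case)
    contacts:  set of frozenset({i, j}) pairs with a reported contact
    eps:       assumed probability that a transmission-pair contact is reported
    lam:       assumed probability of contact between cases not linked by transmission
    """
    if not (0.0 < eps < 1.0 and 0.0 < lam < 1.0):
        raise ValueError("eps and lam must lie strictly between 0 and 1")

    log_lik = 0.0
    for i, j in combinations(sorted(ancestors), 2):
        # A pair is "linked" if one case directly infected the other.
        linked = ancestors[i] == j or ancestors[j] == i
        # Assumed reporting probability: eps if linked, eps * lam otherwise.
        p_report = eps if linked else eps * lam
        reported = frozenset((i, j)) in contacts
        log_lik += math.log(p_report if reported else 1.0 - p_report)
    return log_lik


# Hypothetical three-case outbreak with reported contacts A-B and B-C.
contacts = {frozenset(("A", "B")), frozenset(("B", "C"))}
chain_tree = {"A": None, "B": "A", "C": "B"}  # A infects B, B infects C
star_tree = {"A": None, "B": "A", "C": "A"}   # A infects both B and C

ll_chain = contact_log_likelihood(chain_tree, contacts, eps=0.8, lam=0.1)
ll_star = contact_log_likelihood(star_tree, contacts, eps=0.8, lam=0.1)
print(ll_chain > ll_star)  # the chain agrees better with the reported contacts
```

In a Bayesian framework, a term of this kind would be added to the log-likelihoods derived from symptom timing and genetic distances, so that trees whose transmission pairs coincide with reported contacts receive more posterior support.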

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem*
  • Communicable Diseases / transmission*
  • Communicable Diseases / virology
  • Computational Biology / methods*
  • Contact Tracing*
  • Disease Outbreaks / statistics & numerical data*
  • Genome, Viral / genetics*
  • Humans
  • Models, Biological
  • Severe Acute Respiratory Syndrome / transmission
  • Severe Acute Respiratory Syndrome / virology
  • Severe acute respiratory syndrome-related coronavirus / genetics
  • Singapore
  • Software