Utility of Whole-Genome Sequencing of Escherichia coli O157 for Outbreak Detection and Epidemiological Surveillance

J Clin Microbiol. 2015 Nov;53(11):3565-73. doi: 10.1128/JCM.01066-15. Epub 2015 Sep 9.


Detailed laboratory characterization of Escherichia coli O157 is essential to inform epidemiological investigations. This study assessed the utility of whole-genome sequencing (WGS) for outbreak detection and epidemiological surveillance of E. coli O157, and the data were used to identify discernible associations between genotypes and clinical outcomes. One hundred five E. coli O157 strains isolated over a 5-year period from human fecal samples in Lothian, Scotland, were sequenced with the Ion Torrent Personal Genome Machine. A total of 8,721 variable sites in the core genome were identified among the 105 isolates; 47% of the single nucleotide polymorphisms (SNPs) were attributable to six "atypical" E. coli O157 strains and included recombinant regions. Phylogenetic analyses showed that WGS correlated well with the epidemiological data. Epidemiological links existed between cases whose isolates differed by three or fewer SNPs. WGS also correlated well with multilocus variable-number tandem repeat analysis (MLVA) typing data, with only three discordant results observed, all among isolates from cases not known to be epidemiologically related. WGS produced a better-supported, higher-resolution phylogeny than MLVA, confirming that the method is more suitable for epidemiological surveillance of E. coli O157. A combination of in silico analyses (VirulenceFinder, ResFinder, and local BLAST searches) was used to determine stx subtypes, multilocus sequence types (15 loci), and the presence of virulence and acquired antimicrobial resistance genes. There was a high level of correlation between the WGS data and our routine typing methods, although some discordant results were observed, mostly related to the limitations of short-read sequence assembly. The data were also used to identify sublineages and clades of E. coli O157; correlating these with the clinical outcome data showed that one clade, Ic3, was significantly associated with severe disease.
Together, the results show that WGS data can provide higher resolution of the relationships between E. coli O157 isolates than that provided by MLVA. The method has the potential to streamline the laboratory workflow and provide detailed information for the clinical management of patients and public health interventions.
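
The observation that epidemiologically linked cases had isolates differing by three or fewer SNPs can be illustrated with a small sketch. This is not the authors' pipeline: the function names (`snp_distance`, `cluster_by_threshold`), the single-linkage grouping, and the toy aligned sequences are all assumptions made for illustration, and the three-SNP cutoff is taken from the abstract as a reported observation rather than a validated universal threshold.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous
    base ('N') are skipped (an assumption, not the paper's stated rule).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )


def cluster_by_threshold(seqs: dict[str, str], max_snps: int = 3) -> list[set[str]]:
    """Single-linkage grouping: isolates within max_snps share a cluster."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical toy core-genome alignment (invented data, not from the study).
isolates = {
    "iso_A": "ACGTACGTACGT",
    "iso_B": "ACGTACGTACGA",
    "iso_C": "ACGAACGTACGA",
    "iso_D": "TTTTACCCACGG",
}

for group in cluster_by_threshold(isolates, max_snps=3):
    print(sorted(group))
```

In this toy alignment, iso_A, iso_B, and iso_C fall within the threshold of one another and form one candidate cluster, while iso_D stands apart. Any such grouping would still need to be checked against epidemiological data, as the study did.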

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • DNA, Bacterial / genetics
  • Disease Outbreaks / statistics & numerical data*
  • Drug Resistance, Multiple, Bacterial / genetics*
  • Epidemiological Monitoring
  • Escherichia coli Infections / epidemiology*
  • Escherichia coli Infections / microbiology
  • Escherichia coli O157 / genetics*
  • Escherichia coli O157 / isolation & purification
  • Escherichia coli O157 / pathogenicity
  • Feces / microbiology
  • Genome, Bacterial / genetics*
  • Humans
  • Molecular Sequence Data
  • Multilocus Sequence Typing
  • Polymorphism, Single Nucleotide / genetics
  • Scotland
  • Sequence Analysis, DNA
  • Shiga Toxins / classification
  • Shiga Toxins / genetics
  • Virulence Factors / genetics


Substances

  • DNA, Bacterial
  • Shiga Toxins
  • Virulence Factors