Species-level analysis of DNA sequence data from the NIH Human Microbiome Project

PLoS One. 2012;7(10):e47075. doi: 10.1371/journal.pone.0047075. Epub 2012 Oct 10.


Background: Outbreaks of antibiotic-resistant bacterial infections emphasize the importance of surveillance of potentially pathogenic bacteria. Genomic sequencing of clinical microbiological specimens expands our capacity to study cultivable, fastidious and uncultivable members of the bacterial community. Herein, we compared the primary data collected by the NIH's Human Microbiome Project (HMP) with published epidemiological surveillance data of Staphylococcus aureus.

Methods: The HMP's initial dataset contained microbial survey data from five body regions (skin, nares, oral cavity, gut and vagina) of 242 healthy volunteers. A significant component of the HMP dataset was deep sequencing of the 16S ribosomal RNA gene, which contains variable regions enabling taxonomic classification. Since species-level identification is essential in clinical microbiology, we built a reference database and used phylogenetic placement followed by most recent common ancestor classification to examine the species distribution of Staphylococcus, Klebsiella and Enterococcus.
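To illustrate the most recent common ancestor (MRCA) step named in the Methods, the following is a minimal sketch rather than the authors' pipeline. It assumes each sequence read has already been placed against a reference tree and that every candidate placement carries a taxonomic lineage. The function name mrca_classify, the three-rank lineage layout and the example lineages are hypothetical choices made for illustration only.

```python
from typing import Sequence, Tuple

# Ranks used in this toy example, ordered from broadest to narrowest.
# (Assumption: a real reference taxonomy would have more ranks.)
RANKS = ("family", "genus", "species")


def mrca_classify(lineages: Sequence[Sequence[str]]) -> Tuple[str, str]:
    """Return (rank, name) of the deepest taxon shared by all lineages.

    Each lineage lists taxon names aligned with RANKS. If the lineages
    disagree at the first rank, ("root", "root") is returned.
    """
    if not lineages:
        raise ValueError("at least one lineage is required")
    rank, name = "root", "root"
    # Walk down the ranks and stop at the first level where placements disagree.
    for level, names in zip(RANKS, zip(*lineages)):
        if len(set(names)) != 1:
            break
        rank, name = level, names[0]
    return rank, name


# Hypothetical placements for two reads.
aureus = ("Staphylococcaceae", "Staphylococcus", "Staphylococcus aureus")
epidermidis = ("Staphylococcaceae", "Staphylococcus", "Staphylococcus epidermidis")

resolved = mrca_classify([aureus, aureus])        # placements agree at species level
ambiguous = mrca_classify([aureus, epidermidis])  # placements agree only at genus level
print(resolved)
print(ambiguous)
```

In this sketch a read whose placements span two species is reported only at the genus level, which is why the choice of 16S rRNA region matters: a region that cannot separate close relatives yields genus-level calls instead of species-level ones.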

Main results: We show that selecting the appropriate region of the 16S rRNA gene to sequence is analogous to carefully selecting culture conditions to distinguish closely related bacterial species. Analysis of the HMP data showed that Staphylococcus aureus was present in the nares of 36% of healthy volunteers, consistent with culture-based epidemiological data. Klebsiella pneumoniae and Enterococcus faecalis were found less frequently, but across many habitats.
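A carriage figure such as the one reported for the nares can be thought of as a per-subject prevalence: the fraction of volunteers sampled at a body site who have at least one read classified to the species. The sketch below shows that arithmetic on made-up records. The function name carriage_prevalence, the record layout and the rule that a single classified read counts as presence are assumptions for illustration; the abstract does not state the detection criteria used in the study.

```python
from typing import Iterable, Set, Tuple

# One record per classified read: (subject_id, body_site, species).
Record = Tuple[str, str, str]


def carriage_prevalence(records: Iterable[Record], site: str, species: str) -> float:
    """Fraction of subjects sampled at `site` with at least one read of `species`."""
    sampled: Set[str] = set()
    carriers: Set[str] = set()
    for subject, body_site, taxon in records:
        if body_site != site:
            continue
        sampled.add(subject)
        if taxon == species:
            carriers.add(subject)
    if not sampled:
        raise ValueError(f"no subjects sampled at site {site!r}")
    return len(carriers) / len(sampled)


# Hypothetical toy data, not taken from the HMP dataset.
records = [
    ("S1", "nares", "Staphylococcus aureus"),
    ("S1", "nares", "Staphylococcus epidermidis"),
    ("S2", "nares", "Staphylococcus epidermidis"),
    ("S3", "nares", "Staphylococcus aureus"),
    ("S3", "gut", "Enterococcus faecalis"),
    ("S4", "nares", "Staphylococcus epidermidis"),
    ("S5", "gut", "Klebsiella pneumoniae"),
]

nares_aureus = carriage_prevalence(records, "nares", "Staphylococcus aureus")
print(f"{nares_aureus:.0%}")  # 2 of 4 nares-sampled subjects in the toy data
```

Counting subjects rather than reads keeps the estimate comparable to culture-based surveys, which also report carriage per person.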

Conclusions: This work demonstrates that large 16S rRNA survey studies can support epidemiological goals, given the increasing awareness that microbes flourish and compete within a larger bacterial community. It also shows how genomic techniques and data could be critically important for tracing microbial evolution and implementing hospital infection control.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Databases, Nucleic Acid
  • Drug Resistance, Bacterial
  • Enterococcus faecalis / classification
  • Enterococcus faecalis / genetics*
  • Female
  • Gastrointestinal Tract / microbiology
  • Humans
  • Klebsiella pneumoniae / classification
  • Klebsiella pneumoniae / genetics*
  • Metagenome*
  • Mouth / microbiology
  • National Institutes of Health (U.S.)
  • Phylogeny
  • RNA, Ribosomal, 16S*
  • Sequence Analysis, DNA
  • Skin / microbiology
  • Staphylococcus aureus / classification
  • Staphylococcus aureus / genetics*
  • United States
  • Vagina / microbiology

