Integrating metagenomic and amplicon databases to resolve the phylogenetic and ecological diversity of the Chlamydiae

ISME J. 2014 Jan;8(1):115-25. doi: 10.1038/ismej.2013.142. Epub 2013 Aug 15.

Abstract

In the era of metagenomics and amplicon sequencing, comprehensive analyses of available sequence data remain a challenge. Here we describe an approach exploiting metagenomic and amplicon data sets from public databases to elucidate phylogenetic diversity of defined microbial taxa. We investigated the phylum Chlamydiae whose known members are obligate intracellular bacteria that represent important pathogens of humans and animals, as well as symbionts of protists. Despite their medical relevance, our knowledge about chlamydial diversity is still scarce. Most of the nine known families are represented by only a few isolates, while previous clone library-based surveys suggested the existence of yet uncharacterized members of this phylum. Here we identified more than 22,000 high quality, non-redundant chlamydial 16S rRNA gene sequences in diverse databases, as well as 1900 putative chlamydial protein-encoding genes. Even when applying the most conservative approach, clustering of chlamydial 16S rRNA gene sequences into operational taxonomic units revealed an unexpectedly high species, genus and family-level diversity within the Chlamydiae, including 181 putative families. These in silico findings were verified experimentally in one Antarctic sample, which contained a high diversity of novel Chlamydiae. In our analysis, the Rhabdochlamydiaceae, whose known members infect arthropods, represents the most diverse and species-rich chlamydial family, followed by the protist-associated Parachlamydiaceae, and a putative new family (PCF8) with unknown host specificity. Available information on the origin of metagenomic samples indicated that marine environments contain the majority of the newly discovered chlamydial lineages, highlighting this environment as an important chlamydial reservoir.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / genetics
  • Biodiversity*
  • Chlamydiaceae / classification*
  • Chlamydiaceae / genetics*
  • Databases, Genetic
  • Ecology
  • Environmental Microbiology
  • Metagenomics
  • Phylogeny*
  • RNA, Ribosomal, 16S / genetics

Substances

  • Bacterial Proteins
  • RNA, Ribosomal, 16S