The variability of the 16S rRNA gene in bacterial genomes and its consequences for bacterial community analyses

PLoS One. 2013;8(2):e57923. doi: 10.1371/journal.pone.0057923. Epub 2013 Feb 27.


16S ribosomal RNA currently represents the most important target of study in bacterial ecology. Its use for the description of bacterial diversity is, however, limited by the presence of variable copy numbers in bacterial genomes and by sequence variation within closely related taxa or within a genome. Here we use the information from sequenced bacterial genomes to explore the variability of 16S rRNA sequences and copy numbers at various taxonomic levels and apply it to estimate bacterial genome and DNA abundances. In total, 7,081 16S rRNA sequences were extracted in silico from 1,690 available bacterial genomes (1 to 15 copies per genome). While several phyla contain low 16S rRNA copy numbers, the variation is large in certain taxa, e.g., the Firmicutes and Gammaproteobacteria. Genome sizes are more conserved than 16S rRNA copy numbers at all tested taxonomic levels. Only a minority of bacterial genomes harbor identical 16S rRNA gene copies, and sequence diversity increases with increasing copy numbers. While certain taxa harbor dissimilar 16S rRNA genes, others contain sequences common to multiple species. Sequence identity clusters (often termed operational taxonomic units) thus provide an imperfect representation of bacterial taxa of a certain phylogenetic rank. We demonstrate that the 16S rRNA copy numbers and genome sizes of genome-sequenced bacteria can serve as estimates for the closest related taxon in an environmental dataset, which makes it possible to calculate alternative estimates of the relative abundance of individual bacterial taxa in environmental samples. In an example from forest soil, this procedure would increase the abundance estimates of Acidobacteria and decrease those of Firmicutes. With the currently available information, alternative estimates of bacterial community composition may be obtained in this way if the variation of 16S rRNA copy numbers among bacteria is considered.
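The correction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the two-taxon example, and all numeric values (read counts, copy numbers, genome sizes) are hypothetical and chosen only to show the arithmetic. The sketch assumes that genome abundance is estimated by dividing a taxon's 16S read count by its assumed copy number, and that DNA abundance is estimated by multiplying that genome estimate by an assumed genome size.

```python
def relative(values):
    """Scale a dict of non-negative numbers so that they sum to 1."""
    total = sum(values.values())
    return {taxon: value / total for taxon, value in values.items()}


def genome_abundance(read_counts, copy_numbers):
    """Relative genome abundance: reads divided by 16S copies per genome."""
    genomes = {t: read_counts[t] / copy_numbers[t] for t in read_counts}
    return relative(genomes)


def dna_abundance(read_counts, copy_numbers, genome_sizes_mb):
    """Relative DNA abundance: genome estimate weighted by genome size."""
    genomes = {t: read_counts[t] / copy_numbers[t] for t in read_counts}
    dna = {t: genomes[t] * genome_sizes_mb[t] for t in genomes}
    return relative(dna)


# Hypothetical inputs, for illustration only (not values from the study).
reads = {"Acidobacteria": 300, "Firmicutes": 700}
copies = {"Acidobacteria": 1.0, "Firmicutes": 7.0}   # assumed 16S copies per genome
sizes_mb = {"Acidobacteria": 6.0, "Firmicutes": 3.0}  # assumed genome sizes in Mb

by_reads = relative(reads)
by_genomes = genome_abundance(reads, copies)
by_dna = dna_abundance(reads, copies, sizes_mb)

for taxon in reads:
    print(taxon, round(by_reads[taxon], 3),
          round(by_genomes[taxon], 3), round(by_dna[taxon], 3))
```

With these made-up inputs, the taxon with the lower assumed copy number gains relative abundance after correction and the taxon with the higher assumed copy number loses it, which is the direction of the shift the abstract reports for Acidobacteria and Firmicutes in forest soil.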

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • DNA, Bacterial / genetics
  • Ecosystem*
  • Gene Dosage / genetics
  • Genetic Variation*
  • Genome Size / genetics
  • Genome, Bacterial / genetics*
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics*
  • Sequence Analysis, DNA
  • Soil Microbiology
  • Temperature
  • Trees / microbiology


Substances

  • DNA, Bacterial
  • RNA, Ribosomal, 16S

Grant support

This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic (LD12048, LD12050) and by the research concept of the Institute of Microbiology of the Academy of Sciences of the Czech Republic (RVO61388971). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.