PCR biases distort bacterial and archaeal community structure in pyrosequencing datasets

PLoS One. 2012;7(8):e43093. doi: 10.1371/journal.pone.0043093. Epub 2012 Aug 15.


As 16S rRNA gene targeted massively parallel sequencing has become a common tool for microbial diversity investigations, numerous advances have been made to minimize the influence of sequencing and chimeric PCR artifacts through rigorous quality control measures. However, there has been little effort towards understanding the effect of multi-template PCR biases on microbial community structure. In this study, we used three bacterial and three archaeal mock communities consisting of, respectively, 33 bacterial and 24 archaeal 16S rRNA gene sequences combined in different proportions to compare the influences of (1) sequencing depth, (2) sequencing artifacts (sequencing errors and chimeric PCR artifacts), and (3) biases in multi-template PCR on the interpretation of community structure in pyrosequencing datasets. We also assessed the influence of each of these three variables on α- and β-diversity metrics that rely on the number of OTUs alone (richness) and those that include both membership and the relative abundance of detected OTUs (diversity). As part of this study, we redesigned bacterial and archaeal primer sets that target the V3-V5 region of the 16S rRNA gene, along with multiplexing barcodes, to permit simultaneous sequencing of PCR products from the two domains. We conclude that the benefits of deeper sequencing efforts extend beyond greater OTU detection and result in higher precision in β-diversity analyses by reducing the variability between replicate libraries, despite the presence of more sequencing artifacts. Additionally, spurious OTUs resulting from sequencing errors have a significant impact on richness or shared-richness based α- and β-diversity metrics, whereas metrics that utilize community structure (including both richness and relative abundance of OTUs) are minimally affected by spurious OTUs. However, the greatest obstacle to accurately evaluating community structure is the error in the estimated mean relative abundance of each detected OTU caused by biases associated with multi-template PCR.
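
The contrast between richness-based and structure-based metrics can be illustrated with a small sketch. The OTU counts below are hypothetical toy values, not data from this study, and the metric choices (observed richness, Shannon index, Jaccard and Bray-Curtis dissimilarity) are assumed here as common representatives of each metric family; the abstract does not name the specific metrics used. Two replicate libraries share the same four true OTUs, and each carries five different spurious singleton OTUs.

```python
from math import log

def richness(counts):
    """Observed richness: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index (natural log), computed from relative abundances."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)

def jaccard_dissimilarity(a, b):
    """Membership-only beta-diversity: 1 - shared OTUs / all detected OTUs."""
    in_a = {i for i, c in enumerate(a) if c > 0}
    in_b = {i for i, c in enumerate(b) if c > 0}
    return 1 - len(in_a & in_b) / len(in_a | in_b)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity on relative abundances."""
    total_a, total_b = sum(a), sum(b)
    return 0.5 * sum(abs(x / total_a - y / total_b) for x, y in zip(a, b))

# Hypothetical toy data: four true OTUs plus spurious singletons.
true_counts = [400, 300, 200, 100] + [0] * 10
replicate_a = [400, 300, 200, 100] + [1] * 5 + [0] * 5  # 5 spurious OTUs
replicate_b = [400, 300, 200, 100] + [0] * 5 + [1] * 5  # 5 different spurious OTUs

print("richness (true, replicate A):",
      richness(true_counts), richness(replicate_a))
print("Shannon  (true, replicate A):",
      round(shannon(true_counts), 3), round(shannon(replicate_a), 3))
print("Jaccard dissimilarity between replicates:",
      round(jaccard_dissimilarity(replicate_a, replicate_b), 3))
print("Bray-Curtis dissimilarity between replicates:",
      round(bray_curtis(replicate_a, replicate_b), 4))
```

In this toy case the singletons more than double the observed richness and make the two replicates look largely dissimilar by Jaccard, while the Shannon index and Bray-Curtis dissimilarity barely move. The sketch does not model multi-template PCR bias, which shifts the relative abundances themselves and would therefore affect the structure-based metrics as well.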

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Archaea / genetics*
  • Bacteria / genetics*
  • Computational Biology / methods
  • DNA Primers / genetics
  • Databases, Genetic
  • Genes, Archaeal
  • Genes, Bacterial
  • Genetic Techniques
  • Molecular Sequence Data
  • Polymerase Chain Reaction / methods*
  • RNA, Ribosomal, 16S / genetics
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods*

Substances

  • DNA Primers
  • RNA, Ribosomal, 16S

Associated data

  • GENBANK/JQ346727
  • GENBANK/JQ346728
  • GENBANK/JQ346729
  • GENBANK/JQ346730
  • GENBANK/JQ346731
  • GENBANK/JQ346732
  • GENBANK/JQ346733
  • GENBANK/JQ346734
  • GENBANK/JQ346735
  • GENBANK/JQ346736
  • GENBANK/JQ346737
  • GENBANK/JQ346738
  • GENBANK/JQ346739
  • GENBANK/JQ346740
  • GENBANK/JQ346741
  • GENBANK/JQ346742
  • GENBANK/JQ346743
  • GENBANK/JQ346744
  • GENBANK/JQ346745
  • GENBANK/JQ346746
  • GENBANK/JQ346747
  • GENBANK/JQ346748
  • GENBANK/JQ346749
  • GENBANK/JQ346750
  • GENBANK/JQ346751
  • GENBANK/JQ346752
  • GENBANK/JQ346753
  • GENBANK/JQ346754
  • GENBANK/JQ346755
  • GENBANK/JQ346756
  • GENBANK/JQ346757
  • GENBANK/JQ346758
  • GENBANK/JQ346759
  • GENBANK/JQ346760
  • GENBANK/JQ346761
  • GENBANK/JQ346762
  • GENBANK/JQ346763
  • GENBANK/JQ346764
  • GENBANK/JQ346765
  • GENBANK/JQ346766
  • GENBANK/JQ346767
  • GENBANK/JQ346768
  • GENBANK/JQ346769
  • GENBANK/JQ346770
  • GENBANK/JQ346771
  • GENBANK/JQ346772
  • GENBANK/JQ346773
  • GENBANK/JQ346774
  • GENBANK/JQ346775
  • GENBANK/JQ346776
  • GENBANK/JQ346777
  • GENBANK/JQ346778
  • GENBANK/JQ346779
  • GENBANK/JQ346780
  • GENBANK/JQ346781
  • GENBANK/JQ346782

Grant support

This research was partially supported by United States National Science Foundation grants BES-0412618, CBET-0967707, and CBET-1133793, and Water Research Foundation Tailored Collaboration project no. 4346. No additional external funding was received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.