A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome

BMC Microbiol. 2017 Sep 13;17(1):194. doi: 10.1186/s12866-017-1101-8.


Background: Advancements in Next Generation Sequencing (NGS) technologies in throughput, read length and accuracy have had a major impact on microbiome research by significantly improving 16S rRNA amplicon sequencing. As sequencing platforms improve rapidly and new data analysis pipelines are introduced, it is essential to evaluate their capabilities in specific applications. The aim of this study was to assess whether the same project-specific biological conclusions regarding microbiome composition could be reached using different sequencing platforms and bioinformatics pipelines.

Results: The chicken cecum microbiome was analyzed by 16S rRNA amplicon sequencing using Illumina MiSeq, Ion Torrent PGM, and Roche 454 GS FLX Titanium platforms, with standard and modified protocols for library preparation. We labeled the bioinformatics pipelines included in our analysis QIIME1 and QIIME2 (de novo OTU picking [not to be confused with QIIME version 2 commonly referred to as QIIME2]), QIIME3 and QIIME4 (open reference OTU picking), UPARSE1 and UPARSE2, and DADA2 (for Illumina data only); the two pipelines in each pair differ only in the use of chimera depletion methods. GS FLX+ yielded the longest reads and highest quality scores, while MiSeq generated the largest number of reads after quality filtering. Declines in quality scores were observed starting at bases 150-199 for GS FLX+ and bases 90-99 for MiSeq. Scores were stable for PGM-generated data. Overall microbiome compositional profiles were comparable between platforms; however, the average relative abundance of specific taxa varied depending on sequencing platform, library preparation method, and bioinformatics analysis. Specifically, QIIME with de novo OTU picking yielded the highest number of unique species, and alpha diversity was reduced with UPARSE and DADA2 compared to QIIME.
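
The quantities compared above (relative abundance of taxa, number of unique taxa, alpha diversity) can be illustrated with a minimal sketch. The abstract does not state which alpha diversity metrics were used, so the Shannon index here is an assumption, and the two count tables are hypothetical toy data, not results from the study. The sketch only shows how a pipeline that retains more low-abundance OTUs will report higher richness and alpha diversity for the same sample.

```python
import math

def relative_abundance(counts):
    """Convert raw OTU read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: n / total for otu, n in counts.items()}

def observed_richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over OTUs with nonzero counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)

# Hypothetical counts for one sample processed by two pipelines.
# pipeline_a keeps several rare OTUs; pipeline_b keeps only the dominant ones.
pipeline_a = {"OTU1": 500, "OTU2": 300, "OTU3": 150,
              "OTU4": 30, "OTU5": 15, "OTU6": 5}
pipeline_b = {"OTU1": 520, "OTU2": 320, "OTU3": 160}

for name, table in (("pipeline_a", pipeline_a), ("pipeline_b", pipeline_b)):
    print(name,
          "richness:", observed_richness(table),
          "Shannon: %.3f" % shannon_index(table))
```

In this toy case the dominant OTUs have similar relative abundances in both tables, so the overall profile looks comparable even though richness and the Shannon index differ.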

Conclusions: The three platforms compared in this study were capable of discriminating samples by treatment, despite differences in diversity and abundance, leading to similar biological conclusions. Our results demonstrate that while there were differences in depth of coverage and phylogenetic diversity, all workflows revealed comparable treatment effects on microbial diversity. To increase reproducibility and reliability and to retain consistency between similar studies, it is important to consider the impact on data quality and relative abundance of taxa when selecting NGS platforms and analysis tools for microbiome studies.

Keywords: 16S rRNA amplicon sequencing; Microbiome analysis; Microbiome; Microbiome composition; Next generation sequencing platforms; Bioinformatics pipeline; NGS bias.

MeSH terms

  • Analysis of Variance
  • Animals
  • Bacteria / classification*
  • Bacteria / genetics*
  • Base Sequence
  • Biodiversity
  • Cecum / microbiology
  • Chickens / microbiology
  • Computational Biology / instrumentation
  • Computational Biology / methods*
  • DNA, Bacterial / analysis
  • DNA, Bacterial / genetics
  • DNA, Bacterial / isolation & purification
  • Gastrointestinal Microbiome / genetics*
  • Gene Library
  • High-Throughput Nucleotide Sequencing / instrumentation
  • High-Throughput Nucleotide Sequencing / methods*
  • Microbial Consortia / genetics
  • Multivariate Analysis
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics
  • Reproducibility of Results
  • Statistics as Topic


Substances

  • DNA, Bacterial
  • RNA, Ribosomal, 16S