Robustness of Massively Parallel Sequencing Platforms

PLoS One. 2015 Sep 18;10(9):e0138259. doi: 10.1371/journal.pone.0138259. eCollection 2015.

Abstract

Improvements in high-throughput sequencing (HTS) technologies have made clinical sequencing projects such as ClinSeq and Genomics England feasible. Although the accuracy and reproducibility of HTS-based analyses have improved significantly, using these data for diagnostic and prognostic applications requires near-perfect data generation. To assess the robustness of a widely used HTS platform for accurate and reproducible clinical applications, we generated whole genome shotgun (WGS) sequence data from the genomes of two human individuals in two different genome sequencing centers. After analyzing the data to characterize SNPs and indels using the same tools (BWA, SAMtools, and GATK), we observed a significant number of discrepancies in the call sets. As expected, most of the disagreements between the call sets were found within genomic regions containing common repeats and segmental duplications, and only a small fraction of the discordant variants were within exons and other functionally relevant regions such as promoters. We conclude that although HTS platforms are sufficiently powerful to provide data for first-pass clinical tests, variant predictions still need to be confirmed using orthogonal methods before being used in clinical applications.
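The comparison of call sets described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes each variant call can be reduced to a (chromosome, position, reference allele, alternate allele) tuple, and the names `Variant` and `compare_call_sets` and the toy data are hypothetical. Concordance is defined here as shared calls divided by the union of calls, which is one common choice and not necessarily the measure used in the paper.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """A simplified variant call (hypothetical representation)."""
    chrom: str
    pos: int   # 1-based position
    ref: str   # reference allele
    alt: str   # alternate allele


def compare_call_sets(a: Set[Variant], b: Set[Variant]) -> Dict[str, float]:
    """Count shared and discordant calls between two call sets."""
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        # Assumed definition: shared calls over all distinct calls.
        "concordance": len(shared) / len(union) if union else 1.0,
    }


# Toy call sets for the same individual from two sequencing centers
# (illustrative values only, not data from the study).
center_1 = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 250, "C", "T"),
    Variant("chr2", 75, "G", "GA"),
}
center_2 = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 75, "G", "GA"),
    Variant("chr2", 300, "T", "C"),
}

summary = compare_call_sets(center_1, center_2)
print(summary)
```

In practice, call sets would be read from VCF files, and indels would usually need normalization (left-alignment and trimming) before exact matching of this kind is meaningful.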

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA / genetics*
  • Genome, Human
  • Genotyping Techniques
  • High-Throughput Nucleotide Sequencing / economics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • INDEL Mutation*
  • Polymorphism, Single Nucleotide*
  • Reproducibility of Results

Substances

  • DNA

Grant support

The project is supported by the Republic of Turkey Ministry of Development Infrastructure Grant (no: 2011K120020) and BİLGEM-TÜBİTAK (The Scientific and Technological Research Council of Turkey) (grant no: T439000) to M.S.S. and B.Y., and by a Marie Curie Career Integration Grant (no: 303772) to C.A. The Republic of Turkey Ministry of Development Infrastructure Grant provided financial support in the form of an Illumina HiSeq 2000 sequencing machine and preparation kits for data production; BİLGEM-TÜBİTAK provided support in the form of salaries for authors (PK, MOK, MŞS); and the Marie Curie Career Integration Grant provided support in the form of a salary for author (CA). The funders did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the “author contributions” section.