Systematic Evaluation of Sanger Validation of Next-Generation Sequencing Variants

Clin Chem. 2016 Apr;62(4):647-54. doi: 10.1373/clinchem.2015.249623. Epub 2016 Feb 4.

Abstract

Background: Next-generation sequencing (NGS) data are used for both clinical care and clinical research. DNA sequence variants identified using NGS are often returned to patients/participants as part of clinical or research protocols. The current standard of care is to validate NGS variants using Sanger sequencing, which is costly and time-consuming.

Methods: We performed a large-scale, systematic evaluation of Sanger-based validation of NGS variants using data from the ClinSeq® project. We first used NGS data from 19 genes in 5 participants, comparing them to high-throughput Sanger sequencing results on the same samples, and found no discrepancies among 234 NGS variants. We then compared NGS variants in 5 genes from 684 participants against data from Sanger sequencing.

Results: Of more than 5800 NGS-derived variants, 19 were not initially validated by Sanger data. Using newly designed sequencing primers, Sanger sequencing confirmed 17 of these 19 NGS variants, and the remaining 2 variants had low quality scores from exome sequencing. Overall, we measured a validation rate of 99.965% for NGS variants using Sanger sequencing, which was higher than that of many existing medical tests that do not necessitate orthogonal validation.
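
The reported rate can be roughly checked from the counts given in the Results. The sketch below is illustrative only: the abstract states "over 5800" variants without an exact count, so the denominator of 5800 is an assumed placeholder, and the variable names are hypothetical. The computed value should land close to, but not necessarily exactly on, the published 99.965%.

```python
# Rough check of the validation rate from the counts in the Results.
# ASSUMPTION: the abstract gives only "over 5800" variants; 5800 is a
# placeholder denominator, not the study's exact count.
total_variants = 5800
initially_unvalidated = 19   # NGS variants not validated in the first Sanger round
confirmed_on_retest = 17     # confirmed after Sanger with newly designed primers

# Variants still unconfirmed after the second Sanger round.
still_unvalidated = initially_unvalidated - confirmed_on_retest

# Concordance before and after re-sequencing with new primers.
first_round_rate = 100 * (total_variants - initially_unvalidated) / total_variants
validation_rate = 100 * (total_variants - still_unvalidated) / total_variants

print(f"Unvalidated after retest: {still_unvalidated}")
print(f"First-round concordance: {first_round_rate:.3f}%")
print(f"Validation rate after retest: {validation_rate:.3f}%")
```

Because the true variant count is somewhat different from 5800, the last decimal place of the result depends on the assumed denominator.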

Conclusions: A single round of Sanger sequencing is more likely to incorrectly refute a true-positive variant from NGS than to correctly identify a false-positive variant from NGS. Validation of NGS-derived variants using Sanger sequencing has limited utility, and best practice standards should not include routine orthogonal Sanger validation of NGS variants.
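
The first conclusion follows from how the 19 initial discrepancies resolved. The sketch below simply tallies them. It assumes, for illustration, that the 2 low-quality variants are counted as possible NGS false positives; the abstract does not state that they were proven false, and the variable names are hypothetical.

```python
# Tally of the 19 initial NGS/Sanger discrepancies reported in the Results.
initial_discrepancies = 19
sanger_wrongly_refuted = 17  # true NGS variants missed by the first Sanger round

# ASSUMPTION: the 2 remaining low-quality variants are treated as possible
# NGS false positives; the abstract does not say they were proven false.
possible_ngs_false_positives = initial_discrepancies - sanger_wrongly_refuted

share_sanger_error = sanger_wrongly_refuted / initial_discrepancies
share_possible_ngs_error = possible_ngs_false_positives / initial_discrepancies

print(f"Discrepancies traced to the first Sanger round: {share_sanger_error:.1%}")
print(f"Discrepancies possibly due to NGS: {share_possible_ngs_error:.1%}")
```

Under these assumptions, most discrepancies trace back to the first Sanger round and not to NGS, which is the comparison the conclusion draws.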

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Animals
  • DNA / blood
  • DNA / genetics
  • Genetic Variation / genetics*
  • Genomics / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • High-Throughput Nucleotide Sequencing / standards
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Models, Statistical
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards
  • Sequence Analysis, DNA / statistics & numerical data

Substances

  • DNA