Primer ID Validates Template Sampling Depth and Greatly Reduces the Error Rate of Next-Generation Sequencing of HIV-1 Genomic RNA Populations
- PMID: 26041299
- PMCID: PMC4524263
- DOI: 10.1128/JVI.00522-15
Abstract
Validating the sampling depth and reducing sequencing errors are critical for studies of viral populations using next-generation sequencing (NGS). We previously described the use of Primer ID to tag each viral RNA template with a block of degenerate nucleotides in the cDNA primer. We now show that low-abundance Primer IDs (offspring Primer IDs) are generated due to PCR/sequencing errors. These artifactual Primer IDs can be removed using a cutoff model for the number of reads required to make a template consensus sequence. We have modeled the fraction of sequences lost due to Primer ID resampling. For a typical sequencing run, less than 10% of the raw reads are lost to offspring Primer ID filtering and resampling. The remaining raw reads are used to correct for PCR resampling and sequencing errors. We also demonstrate that Primer ID reveals bias intrinsic to PCR, especially at low template input or utilization. cDNA synthesis and PCR convert ca. 20% of RNA templates into recoverable sequences, and 30-fold sequence coverage recovers most of these template sequences. We have directly measured the residual error rate to be around 1 in 10,000 nucleotides. We use this error rate and the Poisson distribution to define the cutoff to identify preexisting drug resistance mutations at low abundance in an HIV-infected subject. Collectively, these studies show that >90% of the raw sequence reads can be used to validate template sampling depth and to dramatically reduce the error rate in assessing a genetically diverse viral population using NGS.
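The consensus step described above can be illustrated with a minimal sketch. It groups reads by Primer ID, discards Primer IDs with too few reads (treated here as likely offspring Primer IDs), and takes a per-position majority vote. The function name `template_consensus`, the fixed `min_reads` threshold, and the assumption of pre-aligned, equal-length reads are hypothetical simplifications; the abstract refers to a cutoff model whose details are not given here, so a constant cutoff stands in for it.

```python
from collections import Counter, defaultdict


def template_consensus(reads, min_reads=3):
    """Build one consensus sequence per Primer ID.

    reads: iterable of (primer_id, sequence) pairs; sequences sharing a
           Primer ID are assumed to be aligned and of equal length.
    min_reads: hypothetical fixed cutoff; Primer IDs seen fewer times
               than this are dropped as likely offspring Primer IDs.
    """
    groups = defaultdict(list)
    for primer_id, seq in reads:
        groups[primer_id].append(seq)

    consensus = {}
    for primer_id, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to trust this Primer ID
        # Majority vote at each position across the reads of one template.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


if __name__ == "__main__":
    example_reads = [
        ("AAAA", "ACGT"),
        ("AAAA", "ACGT"),
        ("AAAA", "ACTT"),  # one read carries a PCR/sequencing error
        ("AAAT", "ACGT"),  # singleton Primer ID, filtered out
    ]
    print(template_consensus(example_reads))
```

In this toy input the single erroneous read is outvoted and the singleton Primer ID is removed, which mirrors the two effects the abstract attributes to the approach: error correction and removal of artifactual Primer IDs.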
Importance: Although next-generation sequencing (NGS) has revolutionized sequencing strategies, it suffers from serious limitations in defining sequence heterogeneity in a genetically diverse population, such as HIV-1, due to PCR resampling and PCR/sequencing errors. The Primer ID approach reveals the true sampling depth and greatly reduces errors. Knowing the sampling depth allows the construction of a model of how to maximize the recovery of sequences from input templates and to reduce resampling of the Primer ID so that appropriate multiplexing can be included in the experimental design. With the defined sampling depth and measured error rate, we are able to assign cutoffs for the accurate detection of minority variants in viral populations. This approach allows the power of NGS to be realized without having to guess about sampling depth or to ignore the problem of PCR resampling, while also being able to correct most of the errors in the data set.
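As a rough illustration of how a measured error rate and the Poisson distribution can yield a minority-variant cutoff, the sketch below finds the smallest number of template consensus sequences carrying a mutation that would be unlikely to arise from residual error alone. The residual error rate of about 1 in 10,000 nucleotides comes from the abstract; the function names, the significance level `alpha`, and the choice to apply the per-nucleotide rate directly to a single position are assumptions made for illustration, not the authors' stated procedure.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    cdf = 0.0
    term = math.exp(-lam)  # P(X = 0)
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)


def min_count_cutoff(n_templates, error_rate=1e-4, alpha=0.01):
    """Smallest count k whose chance of arising from error alone is < alpha.

    n_templates: number of template consensus sequences (sampling depth).
    error_rate: residual per-nucleotide error rate (1e-4, per the abstract).
    alpha: hypothetical significance level chosen for this example.
    """
    lam = n_templates * error_rate  # expected erroneous calls at one position
    k = 1
    while poisson_upper_tail(k, lam) >= alpha:
        k += 1
    return k


if __name__ == "__main__":
    for depth in (1_000, 10_000):
        print(depth, min_count_cutoff(depth))
```

Under these assumptions the cutoff grows with sampling depth: a variant would need to appear in at least 2 of 1,000 template consensus sequences, or at least 5 of 10,000, before it is distinguishable from residual error at the 1% level.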
Copyright © 2015, American Society for Microbiology. All Rights Reserved.
