SNVer: a statistical tool for variant calling in analysis of pooled or individual next-generation sequencing data
- PMID: 21813454
- PMCID: PMC3201884
- DOI: 10.1093/nar/gkr599
Abstract
We develop a statistical tool, SNVer, for calling common and rare variants in analysis of pooled or individual next-generation sequencing (NGS) data. We formulate variant calling as a hypothesis testing problem and employ a binomial-binomial model to test the significance of observed allele frequency against sequencing error. SNVer reports a single overall P-value for evaluating the significance of a candidate locus being a variant, on which multiplicity control can be based. This is particularly desirable because tens of thousands of loci are simultaneously examined in typical NGS experiments. Each user can choose the false-positive error rate threshold he or she considers appropriate, instead of receiving only the dichotomous 'accept or reject the candidates' decisions provided by most existing methods. We use both simulated data and real data to demonstrate the superior performance of our program in comparison with existing methods. SNVer runs very fast and can complete testing of 300 K loci within an hour. This excellent scalability makes it feasible for analysis of whole-exome sequencing data, or even whole-genome sequencing data using a high-performance computing cluster. SNVer is freely available at http://snver.sourceforge.net/.
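The hypothesis-testing idea in the abstract can be illustrated with a deliberately simplified sketch. This is not SNVer's binomial-binomial model or its code: it assumes a single sample, a plain one-sided binomial test of the alternate-allele read count against a fixed sequencing error rate, and a Bonferroni correction as one possible form of multiplicity control. The function names (`binom_sf`, `call_variants`), the error rate of 0.01, and the example loci are all hypothetical.

```python
import math


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms in log space."""
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def call_variants(loci, error_rate=0.01, alpha=0.05):
    """Test each locus against the error-only null hypothesis.

    loci: list of (name, alt_read_count, depth) tuples (hypothetical layout).
    Returns (name, p_value, is_significant) per locus, using a Bonferroni
    threshold of alpha / number_of_loci (an assumption, not SNVer's procedure).
    """
    threshold = alpha / len(loci)
    results = []
    for name, alt_count, depth in loci:
        # Null hypothesis: every alternate read is a sequencing error.
        p_value = binom_sf(alt_count, depth, error_rate)
        results.append((name, p_value, p_value < threshold))
    return results


# Hypothetical example: two loci, each covered by 100 reads.
example_loci = [("locusA", 1, 100), ("locusB", 20, 100)]
for name, p_value, significant in call_variants(example_loci):
    print(name, p_value, significant)
```

Because each locus yields a P-value rather than a yes/no call, the threshold can be changed after the fact, which is the flexibility the abstract describes; a less conservative correction could be substituted for Bonferroni without touching the per-locus test.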
Similar articles
- SNVerGUI: a desktop tool for variant analysis of next-generation sequencing data. J Med Genet. 2012 Dec;49(12):753-5. doi: 10.1136/jmedgenet-2012-101001. Epub 2012 Sep 28. PMID: 23024288
- A unified approach for allele frequency estimation, SNP detection and association studies based on pooled sequencing data using EM algorithms. BMC Genomics. 2013;14 Suppl 1(Suppl 1):S1. doi: 10.1186/1471-2164-14-S1-S1. Epub 2013 Jan 21. PMID: 23369070 Free PMC article.
- Beta-Binomial Model for the Detection of Rare Mutations in Pooled Next-Generation Sequencing Experiments. J Comput Biol. 2017 Apr;24(4):357-367. doi: 10.1089/cmb.2016.0106. Epub 2016 Sep 15. PMID: 27632638
- SNP calling by sequencing pooled samples. BMC Bioinformatics. 2012 Sep 20;13:239. doi: 10.1186/1471-2105-13-239. PMID: 22992255 Free PMC article.
- Best practices for variant calling in clinical sequencing. Genome Med. 2020 Oct 26;12(1):91. doi: 10.1186/s13073-020-00791-w. PMID: 33106175 Free PMC article. Review.
Cited by
- LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res. 2012 Dec;40(22):11189-201. doi: 10.1093/nar/gks918. Epub 2012 Oct 12. PMID: 23066108 Free PMC article.
- A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation. BMC Genomics. 2013 Feb 28;14:137. doi: 10.1186/1471-2164-14-137. PMID: 23445355 Free PMC article.
- The road from next-generation sequencing to personalized medicine. Per Med. 2014;11(5):523-544. doi: 10.2217/pme.14.34. PMID: 26000024 Free PMC article.
- Safety, infectivity and immunogenicity of a genetically attenuated blood-stage malaria vaccine. BMC Med. 2021 Nov 22;19(1):293. doi: 10.1186/s12916-021-02150-x. PMID: 34802442 Free PMC article. Clinical Trial.
- A 4-cyano-3-methylisoquinoline inhibitor of Plasmodium falciparum growth targets the sodium efflux pump PfATP4. Sci Rep. 2019 Jul 16;9(1):10292. doi: 10.1038/s41598-019-46500-5. PMID: 31311978 Free PMC article.
