A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data
- PMID: 21903627
- PMCID: PMC3198575
- DOI: 10.1093/bioinformatics/btr509
Abstract
Motivation: Most existing methods for DNA sequence analysis rely on accurate sequences or genotypes. However, in applications of next-generation sequencing (NGS), accurate genotypes may not be easy to obtain (e.g. in multi-sample low-coverage sequencing or somatic mutation discovery). These applications call for new methods that analyze sequence data with uncertainty.
Results: We present a statistical framework for calling SNPs, discovering somatic mutations, inferring population genetical parameters and performing association tests directly from sequencing data, without explicit genotyping or linkage-based imputation. On real data, we demonstrate that our method achieves accuracy comparable to that of alternative methods for estimating the site allele count, for inferring the allele frequency spectrum and for association mapping. We also highlight the necessity of using symmetric datasets for finding somatic mutations and confirm that, for discovering rare events, mismapping is frequently the leading source of errors.
Availability: http://samtools.sourceforge.net.
Contact: hengli@broadinstitute.org.
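The idea of working directly from sequencing data without explicit genotyping can be illustrated with a small example. The sketch below estimates a site's alternate allele frequency from per-sample genotype likelihoods using a standard expectation-maximization (EM) update. It assumes diploid samples, a biallelic site and Hardy-Weinberg equilibrium. The function names, the input layout (one tuple of three likelihoods per sample, for 0, 1 and 2 alternate alleles) and the default parameters are hypothetical choices made for illustration and are not taken from the abstract or from the samtools code.

```python
from math import comb


def genotype_prior(g, psi):
    """Hardy-Weinberg probability of carrying g alternate alleles (g = 0, 1, 2)."""
    return comb(2, g) * psi**g * (1 - psi) ** (2 - g)


def estimate_allele_frequency(likelihoods, psi=0.5, tol=1e-9, max_iter=1000):
    """Estimate the alternate allele frequency at one site by EM.

    likelihoods: one (L0, L1, L2) tuple per sample, where Lg is the
    likelihood of the sample's reads given g alternate alleles.
    Hypothetical helper for illustration only.
    """
    n = len(likelihoods)
    for _ in range(max_iter):
        expected_alt = 0.0
        for gl in likelihoods:
            # E-step: weight each genotype by likelihood times prior,
            # then take the expected alternate allele count for this sample.
            weights = [gl[g] * genotype_prior(g, psi) for g in range(3)]
            total = sum(weights)
            expected_alt += sum(g * w for g, w in enumerate(weights)) / total
        # M-step: the new frequency is the expected count over 2n chromosomes.
        new_psi = expected_alt / (2 * n)
        if abs(new_psi - psi) < tol:
            return new_psi
        psi = new_psi
    return psi


if __name__ == "__main__":
    # Three low-coverage samples with uncertain genotypes (made-up values).
    site = [(0.9, 0.1, 0.0), (0.2, 0.7, 0.1), (0.5, 0.4, 0.1)]
    print(estimate_allele_frequency(site))
```

No genotype is ever called in this sketch: each sample contributes an expected allele count weighted by its likelihoods, which is the sense in which uncertainty is carried through to the population-level estimate.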
