Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data
- PMID: 30131346
- PMCID: PMC6216594
- DOI: 10.1534/genetics.118.301336
Abstract
We present two methods for inferring population structure and admixture proportions in low-depth next-generation sequencing (NGS) data. Inference of population structure is essential in both population genetics and association studies, and is often performed using principal component analysis (PCA) or clustering-based approaches. NGS methods provide large amounts of genetic data but are associated with statistical uncertainty, especially for low-depth sequencing data. Models can account for this uncertainty by working directly on the genotype likelihoods of the unobserved genotypes. We propose a method that infers population structure through PCA using an iterative heuristic for estimating individual allele frequencies, and we demonstrate improved accuracy in samples with low and variable sequencing depth on both simulated and real datasets. We also use the estimated individual allele frequencies in a fast non-negative matrix factorization method to estimate admixture proportions. Both methods have been implemented in the PCAngsd framework available at http://www.popgen.dk/software/.
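To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the PCAngsd implementation. It assumes genotype likelihoods stored as an array of shape (individuals, sites, 3), a Hardy-Weinberg prior at each individual's own allele frequency, a truncated SVD as the low-rank step of the iterative heuristic, and plain multiplicative-update NMF with row-normalized proportions. All function names, defaults, and iteration counts are hypothetical choices made for illustration.

```python
import numpy as np

EPS = 1e-4  # keeps frequencies strictly inside (0, 1)


def expected_genotypes(gl, pi):
    """Posterior mean genotype from likelihoods gl (n, m, 3) and frequencies pi (n, m)."""
    # Assumed prior: Hardy-Weinberg proportions at each individual's frequency.
    prior = np.stack([(1 - pi) ** 2, 2 * pi * (1 - pi), pi ** 2], axis=-1)
    post = gl * prior
    post = post / post.sum(axis=-1, keepdims=True)
    return post[..., 1] + 2 * post[..., 2]


def population_frequencies(gl, n_iter=50):
    """EM estimate of per-site population allele frequencies."""
    n, m, _ = gl.shape
    f = np.full(m, 0.25)
    for _ in range(n_iter):
        dosage = expected_genotypes(gl, np.broadcast_to(f, (n, m)))
        f = np.clip(dosage.mean(axis=0) / 2, EPS, 1 - EPS)
    return f


def individual_frequencies(gl, n_components=2, n_iter=20):
    """Iteratively refine individual allele frequencies with a low-rank fit."""
    n, m, _ = gl.shape
    f = population_frequencies(gl)
    pi = np.tile(f, (n, 1))  # start every individual at the population frequency
    for _ in range(n_iter):
        dosage = expected_genotypes(gl, pi)
        u, s, vt = np.linalg.svd(dosage - 2 * f, full_matrices=False)
        k = n_components
        low_rank = (u[:, :k] * s[:k]) @ vt[:k]  # rank-k structure in the dosages
        pi = np.clip((low_rank + 2 * f) / 2, EPS, 1 - EPS)
    return pi, f


def principal_components(gl, pi, f, n_components=2):
    """Top eigenvectors of the covariance of standardized expected genotypes."""
    dosage = expected_genotypes(gl, pi)
    z = (dosage - 2 * f) / np.sqrt(2 * f * (1 - f))
    cov = z @ z.T / z.shape[1]
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order], vals[order]


def admixture_nmf(pi, k, n_iter=200, seed=0):
    """Factor pi ~ q @ fk with q rows on the simplex and fk kept in (0, 1)."""
    rng = np.random.default_rng(seed)
    n, m = pi.shape
    q = rng.uniform(0.1, 0.9, (n, k))
    q /= q.sum(axis=1, keepdims=True)
    fk = rng.uniform(0.1, 0.9, (k, m))
    for _ in range(n_iter):
        fk *= (q.T @ pi) / (q.T @ q @ fk + 1e-12)  # multiplicative update
        fk = np.clip(fk, EPS, 1 - EPS)
        q *= (pi @ fk.T) / (q @ fk @ fk.T + 1e-12)  # multiplicative update
        q /= q.sum(axis=1, keepdims=True)  # heuristic: proportions sum to one
    return q, fk
```

In this sketch, `individual_frequencies` returns the per-individual frequencies that feed both `principal_components` and `admixture_nmf`, mirroring how the abstract reuses the estimated individual allele frequencies for both analyses. The number of components and the number of ancestral populations are left to the caller.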
Keywords: PCA; Population structure; admixture; ancestry; genotype likelihoods; low depth; next-generation sequencing.
Copyright © 2018 by the Genetics Society of America.
Similar articles
- Estimating individual admixture proportions from next generation sequencing data. Genetics. 2013 Nov;195(3):693-702. doi: 10.1534/genetics.113.154138. Epub 2013 Sep 11. PMID: 24026093. Free PMC article.
- fastNGSadmix: admixture proportions and principal component analysis of a single NGS sample. Bioinformatics. 2017 Oct 1;33(19):3148-3150. doi: 10.1093/bioinformatics/btx474. PMID: 28957500.
- Fast individual ancestry inference from DNA sequence data leveraging allele frequencies for multiple populations. BMC Bioinformatics. 2015 Jan 16;16:4. doi: 10.1186/s12859-014-0418-7. PMID: 25592880. Free PMC article.
- Recent progress and challenges in population genetics of polyploid organisms: an overview of current state-of-the-art molecular and statistical tools. Mol Ecol. 2014 Jan;23(1):40-69. doi: 10.1111/mec.12581. Epub 2013 Nov 27. PMID: 24188632. Review.
- Critical review of NGS analyses for de novo genotyping multigene families. Mol Ecol. 2014 Aug;23(16):3957-72. doi: 10.1111/mec.12843. Epub 2014 Jul 21. PMID: 24954669. Review.
Cited by
- Redefining the Evolutionary History of the Rock Dove, Columba livia, Using Whole Genome Sequences. Mol Biol Evol. 2023 Nov 3;40(11):msad243. doi: 10.1093/molbev/msad243. PMID: 37950889. Free PMC article.
- A mechanism for red coloration in vertebrates. Curr Biol. 2022 Oct 10;32(19):4201-4214.e12. doi: 10.1016/j.cub.2022.08.013. Epub 2022 Aug 31. PMID: 36049480. Free PMC article.
- Paleogenomes Reveal a Complex Evolutionary History of Late Pleistocene Bison in Northeastern China. Genes (Basel). 2022 Sep 20;13(10):1684. doi: 10.3390/genes13101684. PMID: 36292570. Free PMC article.
- Footprints of local adaptation span hundreds of linked genes in the Atlantic silverside genome. Evol Lett. 2020 Aug 19;4(5):430-443. doi: 10.1002/evl3.189. eCollection 2020 Oct. PMID: 33014419. Free PMC article.
- Genomic evidence for domestication selection in three hatchery populations of Chinook salmon, Oncorhynchus tshawytscha. Evol Appl. 2024 Feb 14;17(2):e13656. doi: 10.1111/eva.13656. eCollection 2024 Feb. PMID: 38357359. Free PMC article.
