Whole genome sequencing combined with sophisticated bioinformatics techniques has recently been used with success to identify pathogenic DNA sequence variants responsible for individual idiopathic congenital conditions. However, the success of this identification process is heavily influenced by the ancestry or genetic background of the patient. This is because potential pathogenic variants in a patient's genome must be contrasted with variants in a reference set of genomes drawn from other individuals of the same ancestry as the patient. We explored the effect of ignoring the ancestries of both an individual patient and the individuals used to construct reference genomes. We pursued this exploration in two major steps. We first considered variation in the per-genome number and rates of likely functional derived (i.e., non-ancestral, based on the chimp genome) single nucleotide variants and small indels in 52 individual whole human genomes sampled from 10 different global populations. We used a suite of computational and bioinformatics techniques to predict the functional effect of over 24 million genomic variants, both coding and non-coding, across these genomes. We found that the typical human genome harbors ∼5.5-6.1 million total derived variants, of which ∼12,000 are likely to have a functional effect (∼5,000 coding and ∼7,000 non-coding). We also found that the rate of functional genotypes relative to the total number of genotypes in an individual whole genome differs dramatically between human populations. We then created tables showing how the use of comparator or reference genome panels composed of genomes from individuals who do not share a patient's ancestral background can negatively impact pathogenic variant identification. Our results have important implications for clinical sequencing initiatives.
Keywords: clinical sequencing; congenital disease; population genetics; whole genome sequencing.
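The filtering logic described in the abstract, contrasting a patient's variants with those seen in a reference panel, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `candidate_variants`, the tuple representation of a variant, and all of the toy data below are hypothetical and chosen only to show why an ancestry-mismatched panel tends to leave more false candidates.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant representation: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def candidate_variants(patient: Set[Variant],
                       panel: Dict[str, Set[Variant]]) -> Set[Variant]:
    """Return the patient variants that appear in no genome of the reference panel."""
    seen_in_panel: Set[Variant] = set().union(*panel.values())
    return patient - seen_in_panel


# Toy data (invented for illustration only).
causal = ("chr1", 1000, "A", "G")      # assumed true pathogenic variant
common_a = ("chr2", 2000, "C", "T")    # common in the patient's own population
common_b = ("chr3", 3000, "G", "A")    # common in the patient's own population
common_c = ("chr4", 4000, "T", "C")    # shared across populations
other = ("chr5", 5000, "A", "T")       # seen only in the other population

patient = {causal, common_a, common_b, common_c}

# Panel drawn from the same ancestry as the patient.
matched_panel = {
    "ref1": {common_a, common_c},
    "ref2": {common_b, common_c},
}

# Panel drawn from a different ancestry.
mismatched_panel = {
    "ref1": {common_c},
    "ref2": {other},
}

matched_candidates = candidate_variants(patient, matched_panel)
mismatched_candidates = candidate_variants(patient, mismatched_panel)

print(len(matched_candidates), len(mismatched_candidates))
```

With the matched panel only the assumed causal variant survives filtering, whereas the mismatched panel also leaves the two population-specific benign variants as candidates. A real analysis would work from genotype calls and allele frequencies rather than simple set membership.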
- U01 HG006476/HG/NHGRI NIH HHS/United States
- R01 DA030976/DA/NIDA NIH HHS/United States
- U19 AG023122/AG/NIA NIH HHS/United States
- UL1 RR025774/RR/NCRR NIH HHS/United States
- R01 HL089655/HL/NHLBI NIH HHS/United States