Leveraging ancestry to improve causal variant identification in exome sequencing for monogenic disorders

Robert Brown; Hane Lee; Ascia Eskin; Gleb Kichaev; Kirk E Lohmueller; Bruno Reversade; Stanley F Nelson; Bogdan Pasaniuc

doi:10.1038/ejhg.2015.68

Eur J Hum Genet. 2016 Jan;24(1):113-9. doi: 10.1038/ejhg.2015.68. Epub 2015 Apr 22.

Authors

Robert Brown¹, Hane Lee², Ascia Eskin³, Gleb Kichaev¹, Kirk E Lohmueller^{1,4}, Bruno Reversade^{5,6,7}, Stanley F Nelson^{2,3}, Bogdan Pasaniuc^{1,2,3}

Affiliations

¹ Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA.
² Department of Pathology and Laboratory Medicine, Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA.
³ Department of Human Genetics, Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA.
⁴ Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, CA, USA.
⁵ Institute of Medical Biology, Human Genetics and Embryology Laboratory, A*STAR, Singapore, Singapore.
⁶ Institute of Molecular and Cellular Biology, A*STAR, Singapore, Singapore.
⁷ Department of Pediatrics, National University of Singapore, Singapore, Singapore.

Abstract

Recent breakthroughs in exome-sequencing technology have made possible the identification of many causal variants of monogenic disorders. Although extremely powerful when closely related individuals (eg, child and parents) are simultaneously sequenced, sequencing of a single case is often unsuccessful due to the large number of variants that need to be followed up for functional validation. Many approaches filter out common variants above a given frequency threshold (eg, 1%), and then prioritize the remaining variants according to their functional, structural and conservation properties. Here we present methods that leverage the genetic structure across different populations to improve filtering performance while accounting for the finite sample size of the reference panels. We show that leveraging genetic structure reduces the number of variants that need to be followed up by 16% in simulations and by up to 38% in empirical data of 20 exomes from individuals with monogenic disorders for which the causal variants are known.
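
The contrast between a pooled frequency cutoff and a population-aware one can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes one simple strategy: a variant is discarded only when the lower end of a Wilson score interval on its allele frequency exceeds the threshold in at least one reference population. The function names, the population labels and the allele counts below are all hypothetical.

```python
import math


def wilson_lower_bound(alt_count, n_chrom, z=1.96):
    """Lower end of the Wilson score interval for an allele frequency.

    alt_count: alternate alleles observed in the panel
    n_chrom:   chromosomes sampled in the panel
    """
    if n_chrom == 0:
        return 0.0
    p = alt_count / n_chrom
    denom = 1 + z * z / n_chrom
    centre = p + z * z / (2 * n_chrom)
    margin = z * math.sqrt(p * (1 - p) / n_chrom + z * z / (4 * n_chrom ** 2))
    return max(0.0, (centre - margin) / denom)


def pooled_filter(counts, threshold=0.01):
    """Naive filter: True (discard) if the pooled point estimate exceeds threshold."""
    alt = sum(a for a, _ in counts.values())
    n = sum(n for _, n in counts.values())
    return n > 0 and alt / n > threshold


def structured_filter(counts, threshold=0.01, z=1.96):
    """Assumed population-aware filter: True (discard) if the variant is
    confidently common in at least one population of the reference panel."""
    return any(wilson_lower_bound(a, n, z) > threshold for a, n in counts.values())


# Hypothetical counts: {population: (alt_count, chromosomes_sampled)}
population_specific = {"POP_A": (30, 500), "POP_B": (0, 2000), "POP_C": (0, 1500)}
small_panel = {"POP_D": (1, 40)}

# Common in POP_A only: pooling dilutes it below 1%, the structured filter catches it.
print(pooled_filter(population_specific), structured_filter(population_specific))

# One observation in 40 chromosomes: the point estimate is 2.5%, but the panel
# is too small to say confidently that the variant is common, so it is retained.
print(pooled_filter(small_panel), structured_filter(small_panel))
```

In this sketch the population-specific variant escapes the pooled filter but is removed by the structured one, while the variant seen once in a small panel is removed by the pooled point estimate but kept once sampling uncertainty is taken into account.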

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Computational Biology / methods*
Computer Simulation
Exome*
Female
Genetic Diseases, Inborn / diagnosis
Genetic Diseases, Inborn / ethnology
Genetic Diseases, Inborn / genetics*
Genetic Variation
Genome, Human
High-Throughput Nucleotide Sequencing
Humans
Inheritance Patterns
Male
Models, Statistical*
Pedigree
Polymorphism, Single Nucleotide*
Racial Groups
Sequence Analysis, DNA
