Background: The proliferation of genetic profiling has revealed many associations between genetic variation and disease. However, large-scale phenotyping efforts in largely healthy populations, coupled with DNA sequencing, suggest that variants currently annotated as pathogenic are more common in healthy populations than previously thought. In addition, novel and rare variants are frequently observed in disease-associated genes, both in healthy individuals and in those suspected of having disease. This raises the question of whether these variants can be useful predictors of disease. To answer this question, we assessed the degree to which the presence of a variant in the cardiac potassium channel gene KCNH2 was diagnostically predictive for the autosomal dominant long QT syndrome.
Methods: We estimated the probability of a long QT diagnosis given the presence of each KCNH2 variant using Bayesian methods that incorporated variant features such as changes in variant function, protein structure, and in silico predictions. We call this estimate the posttest probability of disease. We applied our method to over 4000 individuals heterozygous for 871 missense or in-frame insertion/deletion variants in KCNH2 and validated it against a separate international cohort of 933 individuals heterozygous for 266 missense or in-frame insertion/deletion variants.
Results: Our method was well-calibrated for the observed fraction of heterozygotes diagnosed with long QT syndrome. Heuristically, we found that the innate diagnostic information one learns about a variant from 3-dimensional variant location, in vitro functional data, and in silico predictors is equivalent to the diagnostic information one learns about that same variant by clinically phenotyping 10 heterozygotes. Most importantly, these data can be obtained in the absence of any clinical observations.
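The "10 heterozygotes" equivalence can be illustrated with a minimal beta-binomial sketch. This is an assumption for illustration, not necessarily the model used in the study: the feature-derived prior is treated as a Beta distribution whose pseudo-counts sum to 10, and it is updated with the observed counts of affected and unaffected heterozygotes. The function name, parameter names, and example numbers are hypothetical.

```python
def posttest_probability(prior_mean, affected, unaffected, prior_weight=10.0):
    """Posterior mean probability of diagnosis under a beta-binomial model.

    prior_mean   -- hypothetical feature-derived prior probability of disease
    affected     -- observed heterozygotes with a long QT diagnosis
    unaffected   -- observed heterozygotes without a diagnosis
    prior_weight -- number of heterozygotes the prior is assumed to be worth
                    (10 here, following the heuristic stated in the Results)
    """
    if not 0.0 <= prior_mean <= 1.0:
        raise ValueError("prior_mean must lie in [0, 1]")
    if affected < 0 or unaffected < 0 or prior_weight <= 0:
        raise ValueError("counts must be non-negative and prior_weight positive")

    # Convert the prior mean and weight into Beta pseudo-counts.
    alpha = prior_mean * prior_weight
    beta = (1.0 - prior_mean) * prior_weight

    # Conjugate update: add the observed counts to the pseudo-counts.
    alpha += affected
    beta += unaffected

    return alpha / (alpha + beta)


# A variant with no phenotyped heterozygotes keeps its feature-derived prior.
no_data = posttest_probability(prior_mean=0.6, affected=0, unaffected=0)

# With 10 observed heterozygotes, prior and data contribute equally.
with_data = posttest_probability(prior_mean=0.6, affected=3, unaffected=7)
```

In this sketch, a variant with no clinical observations retains its prior (0.6), while 10 observed heterozygotes, 3 of them affected, pull the estimate to the midpoint between the prior and the observed fraction of 0.3.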
Conclusions: We show how variant-specific features can inform a prior probability of disease for rare variants even in the absence of clinically phenotyped heterozygotes.
Keywords: genetic variation; heterozygotes; ion channel; long QT syndrome; phenotype.