Phenotype Similarity Regression for Identifying the Genetic Determinants of Rare Diseases

Am J Hum Genet. 2016 Mar 3;98(3):490-499. doi: 10.1016/j.ajhg.2016.01.008. Epub 2016 Feb 25.

Abstract

Rare genetic disorders, which can now be studied systematically with affordable genome sequencing, are often caused by high-penetrance rare variants. Such disorders are often heterogeneous and characterized by abnormalities spanning multiple organ systems ascertained with variable clinical precision. Existing methods for identifying genes with variants responsible for rare diseases summarize phenotypes with unstructured binary or quantitative variables. The Human Phenotype Ontology (HPO) allows composite phenotypes to be represented systematically, but association methods that account for the ontological relationships between HPO terms do not exist. We present a Bayesian method to model the association between an HPO-coded patient phenotype and genotype. Our method estimates the probability of an association together with an HPO-coded phenotype characteristic of the disease. We thus formalize a clinical approach to phenotyping that is lacking in standard regression techniques for rare disease research. We demonstrate the power of our method by uncovering a number of true associations in a large collection of genome-sequenced and HPO-coded cases with rare diseases.
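To make the idea of an ontology-aware phenotype comparison concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy ontology with hypothetical term names (not real HPO identifiers), swaps in a Resnik-style information-content similarity for illustration, and links similarity to genotype through a logistic function with hypothetical coefficients `alpha` and `beta`. The abstract does not specify the similarity function or the model form.

```python
import math

# Toy ontology (child -> parents). Term names are hypothetical placeholders,
# not real HPO identifiers.
PARENTS = {
    "root": set(),
    "blood": {"root"},
    "ear": {"root"},
    "platelet": {"blood"},
    "thrombocytopenia": {"platelet"},
    "macrothrombocytopenia": {"thrombocytopenia"},
    "hearing_loss": {"ear"},
}

def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def closure(terms):
    """Return all terms implied by a set of annotations (terms plus ancestors)."""
    implied = set()
    for term in terms:
        implied |= ancestors(term)
    return implied

def information_content(patients):
    """IC(t) = log(n / number of patients annotated with t or a descendant)."""
    n = len(patients)
    counts = {}
    for terms in patients:
        for term in closure(terms):
            counts[term] = counts.get(term, 0) + 1
    return {term: math.log(n / count) for term, count in counts.items()}

def resnik(term_a, term_b, ic):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(ic.get(term, 0.0) for term in common)

def similarity(patient_terms, characteristic_terms, ic):
    """Average, over characteristic terms, of the best match in the patient."""
    best = [
        max(resnik(c, p, ic) for p in patient_terms)
        for c in characteristic_terms
    ]
    return sum(best) / len(best)

def rare_genotype_probability(sim, alpha=-2.0, beta=3.0):
    """Assumed logistic link; alpha and beta are hypothetical values."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * sim)))

# Illustrative cohort of four patients, each a set of toy terms.
patients = [
    {"macrothrombocytopenia", "hearing_loss"},
    {"thrombocytopenia"},
    {"hearing_loss"},
    {"platelet"},
]
characteristic = {"macrothrombocytopenia", "hearing_loss"}

ic = information_content(patients)
for terms in patients:
    s = similarity(terms, characteristic, ic)
    print(sorted(terms), round(s, 3), round(rare_genotype_probability(s), 3))
```

In this sketch a patient coded only with a broad term such as `platelet` still receives partial credit against a more specific characteristic term, which is the behavior that unstructured binary phenotype variables cannot capture. A Bayesian treatment would place priors on the coefficients and on the characteristic phenotype itself rather than fixing them as done here.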

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Actinin / genetics
  • Adaptor Proteins, Signal Transducing / genetics
  • Bayes Theorem
  • Databases, Genetic
  • Formins
  • Genetic Association Studies / methods*
  • Guanine Nucleotide Exchange Factors / genetics
  • Humans
  • Logistic Models
  • Models, Genetic
  • Phenotype*
  • Rare Diseases / diagnosis*
  • Rare Diseases / genetics*

Substances

  • ACTN1 protein, human
  • Adaptor Proteins, Signal Transducing
  • DIAPH1 protein, human
  • Formins
  • Guanine Nucleotide Exchange Factors
  • RASGRP2 protein, human
  • Actinin