Large-scale phenome analysis defines a behavioral signature for Huntington's disease genotype in mice

Nat Biotechnol. 2016 Aug;34(8):838-44. doi: 10.1038/nbt.3587. Epub 2016 Jul 4.


Rapid technological advances for the frequent monitoring of health parameters have raised the intriguing possibility that an individual's genotype could be predicted from phenotypic data alone. Here we used a machine learning approach to analyze the phenotypic effects of polymorphic mutations in a mouse model of Huntington's disease that determine disease presentation and age of onset. The resulting model correlated variation across 3,086 behavioral traits with seven different CAG-repeat lengths in the huntingtin gene (Htt). We selected behavioral signatures for age and CAG-repeat length that most robustly distinguished between mouse lines and validated the model by correctly predicting the repeat length of a blinded mouse line. Sufficient discriminatory power to accurately predict genotype required combined analysis of >200 phenotypic features. Our results suggest that autosomal dominant disease-causing mutations could be predicted through the use of subtle behavioral signatures that emerge in large-scale, combinatorial analyses. Our work provides an open data platform that we now share with the research community to aid efforts focused on understanding the pathways that link behavioral consequences to genetic variation in Huntington's disease.
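The requirement for combined analysis of >200 phenotypic features can be illustrated with a toy simulation. The sketch below is not the authors' model or data: it assumes two hypothetical genotype groups, synthetic Gaussian traits that each carry a small group difference (`effect`), and a simple nearest-centroid classifier. All names and parameter values are illustrative assumptions. It shows only the general principle that many individually weak traits can jointly separate groups that no single trait separates well.

```python
import random


def simulate_mice(n_per_class, n_features, effect, rng):
    """Return (traits, label) pairs for two hypothetical genotype groups."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            # Each synthetic trait is unit-variance noise plus a small
            # genotype-dependent shift (assumed, not taken from the study).
            row = [rng.gauss(label * effect, 1.0) for _ in range(n_features)]
            data.append((row, label))
    return data


def class_centroids(train, k):
    """Per-class mean of the first k traits."""
    sums = {0: [0.0] * k, 1: [0.0] * k}
    counts = {0: 0, 1: 0}
    for row, label in train:
        counts[label] += 1
        for j in range(k):
            sums[label][j] += row[j]
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def accuracy(train, held_out, k):
    """Nearest-centroid accuracy using only the first k traits."""
    cents = class_centroids(train, k)
    correct = 0
    for row, label in held_out:
        dist = {
            c: sum((row[j] - cents[c][j]) ** 2 for j in range(k))
            for c in cents
        }
        predicted = min(dist, key=dist.get)
        correct += predicted == label
    return correct / len(held_out)


rng = random.Random(0)  # fixed seed so the run is reproducible
train = simulate_mice(200, 300, 0.2, rng)
held_out = simulate_mice(200, 300, 0.2, rng)

acc_single = accuracy(train, held_out, 1)
acc_combined = accuracy(train, held_out, 300)
print(f"1 trait:    {acc_single:.2f}")
print(f"300 traits: {acc_combined:.2f}")
```

In this toy setting a single trait performs only slightly better than chance, while the 300-trait combination classifies most held-out animals correctly. The actual study distinguished seven CAG-repeat lengths across 3,086 measured traits, which this two-group sketch does not attempt to reproduce.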

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Behavior, Animal*
  • Chromosome Mapping / methods
  • Genome / genetics*
  • Genome-Wide Association Study / methods
  • High-Throughput Nucleotide Sequencing / methods
  • Huntingtin Protein / genetics*
  • Huntington Disease / genetics*
  • Mice / classification
  • Mice / genetics*
  • Phenotype*
  • Polymorphism, Single Nucleotide / genetics

Substances

  • Htt protein, mouse
  • Huntingtin Protein