Genome-wide Association Study of Dimensional Psychopathology Using Electronic Health Records

Biol Psychiatry. 2018 Jun 15;83(12):1005-1011. doi: 10.1016/j.biopsych.2017.12.004. Epub 2018 Feb 26.


Background: Genetic studies of neuropsychiatric disease strongly suggest an overlap in liability. There are growing efforts to characterize these diseases dimensionally rather than categorically, but the extent to which such dimensional models correspond to biology is unknown.

Methods: We applied a newly developed natural language processing method to extract five symptom dimensions based on the National Institute of Mental Health Research Domain Criteria definitions from narrative hospital discharge notes in a large biobank. We conducted a genome-wide association study to examine whether common variants were associated with each of these dimensions as quantitative traits.

Results: Among 4687 individuals, loci in three of five domains exceeded a genome-wide threshold for statistical significance. These included a locus spanning the neocortical development genes RFPL3 and RFPL3S for arousal (p = 2.29 × 10⁻⁸) and one spanning the FPR3 gene for cognition (p = 3.22 × 10⁻⁸).

Conclusions: Natural language processing identifies dimensional phenotypes that may facilitate the discovery of common genetic variation that is relevant to psychopathology.

Keywords: Arousal; Genetic; Genomic; Social; Transdiagnostic; Valence.

Publication types

  • Meta-Analysis
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arousal / genetics*
  • Carrier Proteins / genetics*
  • Cognition / physiology*
  • Cohort Studies
  • Electronic Health Records / statistics & numerical data
  • Female
  • Genome-Wide Association Study*
  • Genotype
  • Hospitalization
  • Humans
  • Male
  • Natural Language Processing
  • Psychopathology
  • Receptors, Formyl Peptide / genetics*
  • Ubiquitin-Protein Ligases / genetics*

Substances


  • Carrier Proteins
  • FPR3 protein, human
  • RFPL3 protein, human
  • Receptors, Formyl Peptide
  • RFPL4A protein, human
  • Ubiquitin-Protein Ligases