Research Domain Criteria scores estimated through natural language processing are associated with risk for suicide and accidental death

Depress Anxiety. 2019 May;36(5):392-399. doi: 10.1002/da.22882. Epub 2019 Feb 2.


Background: Identification of individuals at increased risk for suicide is an important public health priority, but the extent to which considering clinical phenomenology improves prediction of longer-term outcomes remains understudied. Hospital discharge provides an opportunity to stratify risk using readily available clinical records and details.

Methods: We applied a validated natural language processing tool to generate estimated Research Domain Criteria (RDoC) scores for a cohort of 444,317 individuals drawn from 815,457 hospital discharges between 2005 and 2013. We used survival analysis to examine the association of these scores with suicide and accidental death, adjusted for sociodemographic features.
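
Illustration (not part of the original abstract): the abstract does not state which survival estimator or model was used, so the sketch below assumes a plain Kaplan-Meier estimate as one common building block of survival analysis. The function name kaplan_meier, the two groups, and all follow-up times are hypothetical toy values, not study data, and no covariate adjustment is attempted.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up time for each individual
    events:    1 if the outcome occurred at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # each individual leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: months of follow-up and outcome flags for two score groups
high_score = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
low_score = kaplan_meier([4, 6, 7, 9, 9], [0, 1, 0, 0, 0])
```

Comparing such curves between groups defined by symptom score is the kind of contrast a log-rank test formalizes; an adjusted analysis would require a regression model, which this sketch omits.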

Results: In adjusted models, symptoms in each of the five domains contributed to incremental risk (log rank P < 0.001), with the greatest increase observed for positive valence. The contribution of each domain to risk was time dependent.

Conclusions: RDoC symptom scores parsed from clinical documentation are associated with suicide, illustrating that multiple domains contribute to risk in a time-varying fashion.

Keywords: Research Domain Criteria; accidental death; electronic health records; natural language processing; suicide; survival analysis.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Behavioral Symptoms*
  • Cohort Studies
  • Death*
  • Electronic Health Records*
  • Female
  • Humans
  • Male
  • Middle Aged
  • Natural Language Processing*
  • Risk
  • Risk Assessment / statistics & numerical data*
  • Suicide / statistics & numerical data*
  • Survival Analysis