Objective: The purpose of this article was to determine whether longitudinal historical data, commonly available in electronic health record (EHR) systems, can be used to predict patients' future risk of suicidal behavior.
Method: Bayesian models were developed using a retrospective cohort approach. EHR data from a large health care database spanning 15 years (1998-2012) of inpatient and outpatient visits were used to predict future documented suicidal behavior (i.e., suicide attempt or death). Patients with three or more visits (N=1,728,549) were included. An ICD-9-based case definition for suicidal behavior was derived by expert clinician consensus review of 2,700 narrative EHR notes (from 520 patients), supplemented by state death certificates. Model performance was evaluated retrospectively using an independent testing set.
Results: Among the study population, 1.2% (N=20,246) met the case definition for suicidal behavior. The model achieved sensitive (33%-45% sensitivity), specific (90%-95% specificity), and early (3-4 years in advance on average) prediction of patients' future suicidal behavior. The strongest predictors identified by the model included both well-known (e.g., substance abuse and psychiatric disorders) and less conventional (e.g., certain injuries and chronic conditions) risk factors, indicating that a data-driven approach can yield more comprehensive risk profiles.
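For readers less familiar with the screening metrics reported above, the following minimal sketch shows how sensitivity, specificity, and prevalence combine into a positive predictive value via Bayes' rule. It is an illustration only, not the authors' model or evaluation code: the function names are hypothetical, and applying the whole-cohort prevalence to the reported operating points is an assumption made here, since the abstract does not report predictive values.

```python
# Illustrative sketch only: the helper names below are hypothetical and
# are not taken from the study.

def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true cases that the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-cases that the model leaves unflagged."""
    return tn / (tn + fp)


def positive_predictive_value(sens: float, spec: float, prevalence: float) -> float:
    """Bayes' rule: probability that a flagged patient is a true case."""
    true_pos_rate = sens * prevalence
    false_pos_rate = (1.0 - spec) * (1.0 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Cohort figures stated in the abstract.
cases = 20_246
cohort = 1_728_549
prevalence = cases / cohort

# ASSUMPTION: pairing the ends of the reported sensitivity and specificity
# ranges, and reusing the whole-cohort prevalence for the testing set.
ppv_high_sens = positive_predictive_value(0.45, 0.90, prevalence)
ppv_high_spec = positive_predictive_value(0.33, 0.95, prevalence)

if __name__ == "__main__":
    print(f"prevalence: {prevalence:.4f}")
    print(f"PPV at 45% sensitivity, 90% specificity: {ppv_high_sens:.4f}")
    print(f"PPV at 33% sensitivity, 95% specificity: {ppv_high_spec:.4f}")
```

Because the outcome is rare, the share of flagged patients who are true cases depends heavily on specificity, which is consistent with framing such a model as a prompt for further screening rather than as a diagnosis.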
Conclusions: Longitudinal EHR data, commonly available in clinical settings, can be useful for predicting future risk of suicidal behavior. This modeling approach could serve as an early warning system to help clinicians identify high-risk patients for further screening. By analyzing the full phenotypic breadth of the EHR, computerized risk screening approaches may enhance prediction beyond what is feasible for individual clinicians.
Keywords: Diagnosis and Classification; Suicide.