Use of a Machine Learning Algorithm to Predict Individuals with Suicide Ideation in the General Population

Psychiatry Investig. 2018 Nov;15(11):1030-1036. doi: 10.30773/pi.2018.08.27. Epub 2018 Oct 11.


Objective: We aimed to develop a machine learning model that identifies individuals with suicide ideation in the general population.

Methods: Among 35,116 individuals aged over 19 years from the Korea National Health & Nutrition Examination Survey, we selected 11,628 individuals via random down-sampling: 5,814 suicide ideators and the same number of non-suicide ideators. We randomly assigned the subjects to a training set (n=10,466) and a test set (n=1,162). In the training set, a random forest model was trained on 15 features selected by recursive feature elimination with 10-fold cross-validation. The fitted model was then used to predict suicide ideators in the test set and in the full sample of 35,116 subjects. All analyses were conducted in R.
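The down-sampling and train/test assignment described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code. It assumes that all 5,814 ideators in the full sample were retained and that the test set is a 10% hold-out, which is inferred from the reported set sizes. The function name `downsample_and_split` and the synthetic labels are hypothetical. Feature selection and random forest fitting are not shown.

```python
import random


def downsample_and_split(labels, test_fraction=0.1, seed=0):
    """Balance two classes by random down-sampling, then split indices.

    Hypothetical helper for illustration; returns (train_idx, test_idx).
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]

    # Keep every minority-class case and draw an equal number of
    # majority-class cases without replacement.
    minority, majority = sorted((positives, negatives), key=len)
    balanced = minority + rng.sample(majority, len(minority))

    # Shuffle, then hold out a fixed fraction (rounded down) for testing.
    rng.shuffle(balanced)
    n_test = int(len(balanced) * test_fraction)
    return balanced[n_test:], balanced[:n_test]


# Synthetic labels matching the reported counts (assumption: 5,814
# ideators among the 35,116 subjects; the remainder are non-ideators).
labels = [1] * 5814 + [0] * (35116 - 5814)
train_idx, test_idx = downsample_and_split(labels)
print(len(train_idx), len(test_idx))
```

Under these assumptions the split sizes match the reported n=10,466 and n=1,162.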

Results: The prediction model achieved good performance [area under the receiver operating characteristic curve (AUC)=0.85] in the test set and predicted suicide ideators in the total sample with an accuracy of 0.821, sensitivity of 0.836, and specificity of 0.807.
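For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from binary predictions. It is an illustration in Python rather than the authors' R code. The function name `screening_metrics` and the ten toy labels are hypothetical and are not study data.

```python
def screening_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ideators flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ideators missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-ideators cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-ideators flagged
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy example (hypothetical labels, not from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity describes how many true ideators the model flags, and specificity describes how many non-ideators it correctly leaves unflagged, so the two are reported together with overall accuracy.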

Conclusion: This study suggests that a machine learning approach may enable screening for suicide risk in the general population. Further work is warranted to increase the accuracy of prediction.

Keywords: Machine learning algorithm; Prediction; Public health data; Suicide ideation.