Assessment of machine learning algorithms in national data to classify the risk of self-harm among young adults in hospital: A retrospective study

Int J Med Inform. 2023 Sep:177:105164. doi: 10.1016/j.ijmedinf.2023.105164. Epub 2023 Jul 25.

Abstract

Background: Self-harm is one of the most common presentations at accident and emergency departments in the UK and is a strong predictor of suicide risk. The UK Government has prioritised identifying risk factors and developing preventative strategies for self-harm. Machine learning offers a potential method to identify complex patterns with predictive value for the risk of self-harm.

Methods: National data in the UK Mental Health Services Data Set were extracted for patients aged 18-30 years who started a mental health hospital admission between Aug 1, 2020 and Aug 1, 2021, and had been discharged by Jan 1, 2022. Data on age group, gender, ethnicity, employment status, marital status, accommodation status, and source of admission to hospital were used to construct seven machine learning models. The models were applied individually and as an ensemble to predict hospital stays associated with a risk of self-harm.
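The abstract does not state how the seven models were combined into an ensemble. As a minimal sketch of one common approach, unweighted soft voting, the example below averages each model's predicted probability for every hospital stay. The model names, probabilities, and the 0.5 threshold are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from statistics import mean


def soft_vote(prob_by_model):
    """Average per-model probabilities for each stay (unweighted soft voting)."""
    lengths = {len(p) for p in prob_by_model.values()}
    if len(lengths) != 1:
        raise ValueError("every model must score the same number of stays")
    # zip(*...) groups the scores that all models gave to the same stay.
    return [mean(scores) for scores in zip(*prob_by_model.values())]


# Hypothetical predicted probabilities of self-harm for four hospital stays.
# Only the random forest is named in the abstract; the other names are assumed.
probs = {
    "random_forest": [0.10, 0.65, 0.30, 0.80],
    "logistic_regression": [0.20, 0.40, 0.10, 0.70],
    "gradient_boosting": [0.30, 0.50, 0.20, 0.90],
}

ensemble = soft_vote(probs)
# Hypothetical decision threshold; a real threshold would be tuned on validation data.
flagged = [p >= 0.5 for p in ensemble]
```

Averaging probabilities rather than hard class votes keeps each model's confidence in the combined score, which matters when the outcome is rare and few models would cross a fixed threshold on their own.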

Outcomes: The training dataset included 23 808 items (including 1081 episodes of self-harm) and the testing dataset 5951 items (including 270 episodes of self-harm). The best performing algorithms were the random forest model (AUC-ROC 0.70, 95% CI: 0.66-0.74) and the ensemble model (AUC-ROC 0.77, 95% CI: 0.75-0.79).
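For readers unfamiliar with the metric, the sketch below shows one way an AUC-ROC and a 95% confidence interval can be computed. The abstract does not say how the intervals were derived; the percentile bootstrap used here is an assumption, and the labels and scores are synthetic, not study data.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC: chance that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC-ROC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    idx = range(len(labels))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]  # resample stays with replacement
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:  # skip resamples that contain only one class
            continue
        stats.append(auc_roc(ys, [scores[i] for i in sample]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic example: 200 stays, 20% positives, with positives scoring higher on average.
rng = random.Random(42)
labels = [1] * 40 + [0] * 160
scores = [rng.gauss(0.6 if y else 0.4, 0.2) for y in labels]

auc = auc_roc(labels, scores)
ci_low, ci_high = bootstrap_ci(labels, scores)
```

Self-harm episodes make up roughly 4.5% of items in both the training set (1081 of 23 808) and the testing set (270 of 5951). AUC-ROC summarises how well scores rank positives above negatives and does not by itself show how many flagged stays would be true positives at a given threshold.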

Interpretation: Machine learning algorithms could predict hospital stays with a high risk of self-harm based on readily available data that are routinely collected by health providers and recorded in the Mental Health Services Data Set. The findings should be validated externally with other real-world, prospective data.

Funding: This study was supported by the Midlands and Lancashire Commissioning Support Unit.

Keywords: Algorithmic bias; Artificial intelligence; Deep learning; Generalisability; Neural networks; Psychiatry; Risk stratification; Statistical models.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Hospitals
  • Humans
  • Machine Learning
  • Prospective Studies
  • Retrospective Studies
  • Risk Assessment
  • Self-Injurious Behavior* / diagnosis
  • Self-Injurious Behavior* / epidemiology
  • Self-Injurious Behavior* / psychology
  • Young Adult