A Natural Language Processing Algorithm for Classifying Suicidal Behaviors in Alzheimer's Disease and Related Dementia Patients: Development and Validation Using Electronic Health Records Data

medRxiv [Preprint]. 2023 Jul 24:2023.07.21.23292976. doi: 10.1101/2023.07.21.23292976.


This study aimed to develop a natural language processing (NLP) algorithm using machine learning (ML) and deep learning (DL) techniques to identify and classify documentation of suicidal behaviors in patients with Alzheimer's disease and related dementia (ADRD). We used the MIMIC-III and MIMIC-IV datasets to identify ADRD patients and, among them, those with suicidal ideation, using relevant International Classification of Diseases (ICD) codes. We then computed cosine similarity between the clinical notes extracted from MIMIC and ScAN (Suicide Attempt and Ideation Events Dataset) to obtain a semantic similarity score for each note. The notes were sorted by these scores, then manually reviewed and categorized into eight suicidal behavior categories. The data were further analyzed using conventional ML and DL models, with the manual annotation serving as the reference. The tested classifiers achieved results close to human performance, with up to 98% precision and 98% recall for suicidal ideation in the ADRD patient population. Our NLP model effectively reproduced human annotation of suicidal ideation within the MIMIC dataset. These results establish a foundation for identifying and categorizing documentation of suicidal ideation within the ADRD population and contribute to the advancement of NLP techniques for extracting and classifying clinical concepts in healthcare. Our study demonstrated that a robust NLP algorithm can accurately identify and classify documentation of suicidal behaviors in ADRD patients.
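The similarity-based ranking step described above can be illustrated with a minimal sketch. The abstract does not state how notes were vectorized, so this example assumes simple term-frequency vectors and scores each note by its highest cosine similarity to any reference text; the function names (`tokenize`, `cosine_similarity`, `rank_notes`) and the sample strings are hypothetical and are not taken from the study or from ScAN.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_notes(reference_texts, notes):
    """Return (score, note) pairs sorted from most to least similar.

    Assumption: a note's score is its maximum similarity to any
    reference text; the study may have aggregated scores differently.
    """
    ref_vecs = [Counter(tokenize(t)) for t in reference_texts]
    scored = []
    for note in notes:
        vec = Counter(tokenize(note))
        score = max((cosine_similarity(vec, r) for r in ref_vecs), default=0.0)
        scored.append((score, note))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Invented toy strings for illustration only (not real clinical data).
reference = ["patient reports thoughts of suicide"]
notes = [
    "patient denies pain, ambulating well",
    "patient reports thoughts of suicide and hopelessness",
]
ranked = rank_notes(reference, notes)
```

In a workflow like the one described, the top of such a ranked list would be the natural starting point for manual review. A real pipeline would more likely use dense sentence embeddings than raw word counts, but the ranking logic would remain the same.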

Publication types

  • Preprint