Identifying Suicide Ideation and Suicidal Attempts in a Psychiatric Clinical Research Database using Natural Language Processing

Andrea C Fernandes; Rina Dutta; Sumithra Velupillai; Jyoti Sanyal; Robert Stewart; David Chandran

doi:10.1038/s41598-018-25773-2

Sci Rep. 2018 May 9;8(1):7426. doi: 10.1038/s41598-018-25773-2.

Authors

Andrea C Fernandes¹,², Rina Dutta³,⁴, Sumithra Velupillai³,⁴, Jyoti Sanyal³,⁴, Robert Stewart³,⁴, David Chandran³,⁴

Affiliations

¹ Institute of Psychiatry, Psychology and Neuroscience, Academic Department of Psychological Medicine, London, SE5 8AF, United Kingdom. andrea.fernandes@kcl.ac.uk.
² UK National Institute for Health Research Biomedical Research Centre, South London and Maudsley National Health Service Foundation Trust and King's College London, London, SE5 8AZ, United Kingdom. andrea.fernandes@kcl.ac.uk.
³ Institute of Psychiatry, Psychology and Neuroscience, Academic Department of Psychological Medicine, London, SE5 8AF, United Kingdom.
⁴ UK National Institute for Health Research Biomedical Research Centre, South London and Maudsley National Health Service Foundation Trust and King's College London, London, SE5 8AZ, United Kingdom.

Abstract

Research into suicide prevention has been hampered by methodological limitations such as small sample sizes and recall bias. Recently, Natural Language Processing (NLP) strategies have been used with Electronic Health Records to increase information extraction from free-text notes as well as structured fields concerning suicidality, which allows access to much larger cohorts than previously possible. This paper presents two novel NLP approaches: a rule-based approach to classify the presence of suicide ideation, and a hybrid machine learning and rule-based approach to identify suicide attempts in a psychiatric clinical database. Good performance of the two classifiers in the evaluation study suggests they can be used to accurately detect mentions of suicide ideation and attempt within free-text documents in this psychiatric database. The novelty of the two approaches lies in the malleability of each classifier should a need arise to refine performance or meet alternative classification requirements. The algorithms can also be adapted to fit the infrastructures of other clinical datasets, given sufficient knowledge of clinical recording practice, without depending on medical codes or additional data extraction of known risk factors to predict suicidal behaviour.
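To illustrate what a rule-based mention classifier of the kind described above might look like, here is a minimal sketch. It is not the authors' algorithm: the abstract does not give their rules or term lists, so the patterns, the `classify_sentence` function, the label names, and the 40-character negation window are all assumptions chosen for illustration only.

```python
import re

# Hypothetical lexicon: the paper's actual terms and rules are not stated
# in this abstract, so these patterns are illustrative assumptions.
IDEATION_PATTERN = re.compile(
    r"\bsuicid(?:al|e)\s+(?:ideation|thoughts?)\b", re.IGNORECASE
)
NEGATION_PATTERN = re.compile(
    r"\b(?:no|not|denies|denied|without)\b", re.IGNORECASE
)


def classify_sentence(sentence: str, window: int = 40) -> str:
    """Label one sentence as 'positive', 'negated', or 'no_mention'.

    A mention counts as negated if a negation cue appears within
    `window` characters before it (an assumed, simplistic heuristic).
    """
    match = IDEATION_PATTERN.search(sentence)
    if match is None:
        return "no_mention"
    # Look only at the text immediately preceding the mention.
    preceding = sentence[max(0, match.start() - window):match.start()]
    if NEGATION_PATTERN.search(preceding):
        return "negated"
    return "positive"
```

A sketch like this is easy to adjust, which is the kind of malleability the abstract refers to: changing the lexicon or the negation window changes the classifier's behaviour without retraining. A real clinical system would need far richer rules and validation against annotated records.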

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Data Mining / methods*
Databases, Factual*
Humans
Natural Language Processing*
Suicidal Ideation*
Suicide, Attempted / statistics & numerical data*

Grants and funding

MC_PC_17214/MRC_/Medical Research Council/United Kingdom