Identifying Patients with Depression Using Free-text Clinical Documents

Li Zhou; Amy W Baughman; Victor J Lei; Kenneth H Lai; Amol S Navathe; Frank Chang; Margarita Sordo; Maxim Topaz; Feiran Zhong; Madhavan Murrali; Shamkant Navathe; Roberto A Rocha

Stud Health Technol Inform. 2015:216:629-33.

Affiliations

¹ Clinical Informatics, Partners eCare, Partners Healthcare Inc. Boston, MA, USA.
² Division of General Internal Medicine and Primary Care, Brigham & Women's Hospital, Harvard Medical School, Boston, MA, USA.
³ Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
⁴ School of Computer Science, College of Computing, Georgia Institute of Technology, Atlanta, GA, USA.

PMID: 26262127

Abstract

About 1 in 10 adults are reported to exhibit clinical depression, and the associated personal, societal, and economic costs are significant. In this study, we applied the MTERMS NLP system and machine learning classification algorithms to identify patients with depression using discharge summaries. Domain experts reviewed both the training and test cases and classified each case as depression with high, intermediate, or low confidence. For high-confidence depression cases, all of the algorithms we tested performed similarly, with MTERMS' knowledge-based decision tree slightly better than the machine learning classifiers, achieving an F-measure of 89.6%. MTERMS also achieved the highest F-measure (70.6%) on intermediate-confidence cases. The RIPPER rule learner was the best-performing machine learning method, with an F-measure of 70.0% and higher precision but lower recall than MTERMS. The proposed NLP-based approach was able to identify a significant portion of the depression cases (about 20%) that were not on the coded diagnosis list.
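
The abstract compares classifiers by precision, recall, and F-measure. As a reading aid, the following is a minimal sketch of how these three metrics relate, assuming the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state which weighting was used. The function name and the counts are hypothetical and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive,
    and false-negative counts. Assumes the balanced F-measure (beta = 1)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only; not results from the study.
p, r, f = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Because F1 is a harmonic mean, two classifiers can reach similar F-measures with different trade-offs, which is consistent with the abstract's note that RIPPER had higher precision but lower recall than MTERMS.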

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Boston
Data Mining / methods*
Decision Support Systems, Clinical / organization & administration*
Depression / classification
Depression / diagnosis*
Diagnosis, Computer-Assisted / methods*
Electronic Health Records / classification*
Humans
Machine Learning
Natural Language Processing*
Reproducibility of Results
Sensitivity and Specificity