Natural language processing of clinical notes for identification of critical limb ischemia

Naveed Afzal; Vishnu Priya Mallipeddi; Sunghwan Sohn; Hongfang Liu; Rajeev Chaudhry; Christopher G Scott; Iftikhar J Kullo; Adelaide M Arruda-Olson

doi:10.1016/j.ijmedinf.2017.12.024

Int J Med Inform. 2018 Mar:111:83-89. doi: 10.1016/j.ijmedinf.2017.12.024. Epub 2017 Dec 28.

Authors

Naveed Afzal¹, Vishnu Priya Mallipeddi², Sunghwan Sohn¹, Hongfang Liu¹, Rajeev Chaudhry³, Christopher G Scott¹, Iftikhar J Kullo², Adelaide M Arruda-Olson⁴

Affiliations

¹ Department of Health Sciences Research, Mayo Clinic and Mayo Foundation, Rochester, MN, United States.
² Department of Cardiovascular Diseases, Mayo Clinic and Mayo Foundation, Rochester, MN, United States.
³ Division of Primary Care Medicine, Knowledge Delivery Center and Center for Innovation, Mayo Clinic and Mayo Foundation, Rochester, MN, United States.
⁴ Department of Cardiovascular Diseases, Mayo Clinic and Mayo Foundation, Rochester, MN, United States. Electronic address: ArrudaOlson.Adelaide@mayo.edu.

Abstract

Background: Critical limb ischemia (CLI) is a complication of advanced peripheral artery disease (PAD), with diagnosis based on the presence of clinical signs and symptoms. However, automated identification of cases from electronic health records (EHRs) is challenging due to the absence of a single definitive International Classification of Diseases (ICD-9 or ICD-10) code for CLI.

Methods and results: In this study, we extended a previously validated natural language processing (NLP) algorithm for PAD identification to develop and validate a subphenotyping NLP algorithm (CLI-NLP) for identification of CLI cases from clinical notes. We compared the performance of the CLI-NLP algorithm with that of CLI-related ICD-9 billing codes. The gold standard for validation was human abstraction of clinical notes from EHRs. Compared to billing codes, the CLI-NLP algorithm had higher positive predictive value (PPV) (CLI-NLP 96%, billing codes 67%, p < 0.001), specificity (CLI-NLP 98%, billing codes 74%, p < 0.001) and F1-score (CLI-NLP 90%, billing codes 76%, p < 0.001). The sensitivity of these two methods was similar (CLI-NLP 84%; billing codes 88%; p < 0.12).
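
The F1-scores reported above are consistent with the standard definition of F1 as the harmonic mean of PPV (precision) and sensitivity (recall). The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the function name `f1_score` and the dictionary layout are illustrative assumptions, and the inputs are the rounded percentages quoted in the abstract rather than the underlying patient counts.

```python
def f1_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Rounded values taken from the abstract (hypothetical container, for illustration only).
reported = {
    "CLI-NLP": {"ppv": 0.96, "sensitivity": 0.84},
    "billing codes": {"ppv": 0.67, "sensitivity": 0.88},
}

f1_by_method = {
    name: f1_score(m["ppv"], m["sensitivity"]) for name, m in reported.items()
}

for name, f1 in f1_by_method.items():
    print(f"{name}: F1 = {f1:.0%}")
```

With these rounded inputs the computed values are about 0.896 and 0.761, which round to the 90% and 76% quoted in the abstract.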

Conclusions: The CLI-NLP algorithm for identification of CLI from narrative clinical notes in an EHR had excellent PPV and has potential for translation to patient care, as it will enable automated identification of CLI cases for quality projects and clinical decision support tools and will support a learning healthcare system.

Keywords: Critical limb ischemia; Electronic health records; Natural language processing; Peripheral artery disease; Subphenotyping.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Aged
Algorithms*
Case-Control Studies
Data Mining / methods*
Electronic Health Records*
Female
Humans
Ischemia / diagnosis*
Ischemia / etiology*
Lower Extremity / blood supply*
Male
Natural Language Processing*
Peripheral Arterial Disease / complications
