A taxonomy for advancing systematic error analysis in multi-site electronic health record-based clinical concept extraction

Sunyang Fu; Liwei Wang; Huan He; Andrew Wen; Nansu Zong; Anamika Kumari; Feifan Liu; Sicheng Zhou; Rui Zhang; Chenyu Li; Yanshan Wang; Jennifer St Sauver; Hongfang Liu; Sunghwan Sohn

doi:10.1093/jamia/ocae101

J Am Med Inform Assoc. 2024 May 14:ocae101. doi: 10.1093/jamia/ocae101. Online ahead of print.

Authors

Sunyang Fu¹,², Liwei Wang¹,², Huan He³, Andrew Wen², Nansu Zong¹, Anamika Kumari⁴, Feifan Liu⁴, Sicheng Zhou⁵, Rui Zhang⁵, Chenyu Li⁶, Yanshan Wang⁶, Jennifer St Sauver⁷, Hongfang Liu¹,², Sunghwan Sohn¹

Affiliations

¹ Department of AI and Informatics, Mayo Clinic, Rochester, MN 55902, United States.
² Center for Translational AI Excellence and Applications in Medicine, University of Texas Health Science Center at Houston, Houston, TX 77030, United States.
³ Department of Biomedical Informatics & Data Science, Yale University, New Haven, CT 06520, United States.
⁴ Department of Population and Quantitative Health Sciences, University of Massachusetts Chan Medical School, Worcester, MA 01655, United States.
⁵ Division of Computational Health Sciences, University of Minnesota, Minneapolis, MN 55455, United States.
⁶ Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA 15260, United States.
⁷ Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55902, United States.

PMID: 38742455
DOI: 10.1093/jamia/ocae101

Abstract

Background: Error analysis plays a crucial role in clinical concept extraction, a fundamental subtask within clinical natural language processing (NLP). The process typically involves a manual review of error types, such as contextual and linguistic factors contributing to their occurrence, and the identification of underlying causes to refine the NLP model and improve its performance. Conducting error analysis can be complex, requiring a combination of NLP expertise and domain-specific knowledge. Due to the high heterogeneity of electronic health record (EHR) settings across different institutions, challenges may arise when attempting to standardize and reproduce the error analysis process.

Objectives: This study aims to facilitate a collaborative effort to establish common definitions and taxonomies for capturing diverse error types, fostering community consensus on error analysis for clinical concept extraction tasks.

Materials and methods: We iteratively developed and evaluated an error taxonomy based on existing literature, standards, real-world data, multisite case evaluations, and community feedback. The finalized taxonomy was released in both .dtd and .owl formats at the Open Health Natural Language Processing Consortium. The taxonomy is compatible with several different open-source annotation tools, including MAE, Brat, and MedTator.

Results: The resulting error taxonomy comprises 43 distinct error classes, organized into 6 error dimensions and 4 properties, including model type (symbolic and statistical machine learning), evaluation subject (model and human), evaluation level (patient, document, sentence, and concept), and annotation examples. Internal and external evaluations revealed strong variations in error types across methodological approaches, tasks, and EHR settings. Key points emerged from community feedback, including the need to enhance the clarity, generalizability, and usability of the taxonomy, along with dissemination strategies.
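
As an illustration only, the four properties named above could be attached to each annotated error as a small record, which then makes it straightforward to compare error-class counts across sites. The sketch below is a minimal Python example under stated assumptions: the names `ErrorRecord` and `tally_by_site`, the site identifiers, and the error class and dimension labels are hypothetical placeholders, not entries from the released .dtd or .owl files.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ModelType(Enum):
    SYMBOLIC = "symbolic"
    STATISTICAL_ML = "statistical machine learning"


class EvaluationSubject(Enum):
    MODEL = "model"
    HUMAN = "human"


class EvaluationLevel(Enum):
    PATIENT = "patient"
    DOCUMENT = "document"
    SENTENCE = "sentence"
    CONCEPT = "concept"


@dataclass(frozen=True)
class ErrorRecord:
    """One annotated error; field names are assumptions for illustration."""
    error_class: str   # placeholder label, not a class from the taxonomy
    dimension: str     # placeholder label, not a dimension from the taxonomy
    model_type: ModelType
    subject: EvaluationSubject
    level: EvaluationLevel
    example: str       # annotation example (text snippet)


def tally_by_site(records_by_site):
    """Count error classes per site so distributions can be compared."""
    return {
        site: Counter(record.error_class for record in records)
        for site, records in records_by_site.items()
    }


# Hypothetical records from two hypothetical sites.
records_by_site = {
    "site_a": [
        ErrorRecord("placeholder_class_1", "placeholder_dimension_1",
                    ModelType.SYMBOLIC, EvaluationSubject.MODEL,
                    EvaluationLevel.CONCEPT, "example snippet 1"),
        ErrorRecord("placeholder_class_1", "placeholder_dimension_1",
                    ModelType.SYMBOLIC, EvaluationSubject.MODEL,
                    EvaluationLevel.SENTENCE, "example snippet 2"),
    ],
    "site_b": [
        ErrorRecord("placeholder_class_2", "placeholder_dimension_2",
                    ModelType.STATISTICAL_ML, EvaluationSubject.HUMAN,
                    EvaluationLevel.DOCUMENT, "example snippet 3"),
    ],
}

counts = tally_by_site(records_by_site)
```

Representing the enumerated properties as `Enum` types keeps annotations from different sites restricted to the same closed set of values, which is one plausible way to support the standardization the taxonomy aims for.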

Conclusion: The proposed taxonomy can facilitate the acceleration and standardization of the error analysis process in multi-site settings, thus improving the provenance, interpretability, and portability of NLP models. Future researchers could explore the potential direction of developing automated or semi-automated methods to assist in the classification and standardization of error analysis.

Keywords: electronic health record; error analysis; natural language processing.
