Background: Systems-centered root cause analysis (RCA) of patient safety events offers distinct advantages because it aims to expose vulnerabilities in healthcare systems. However, the growing number of collected events makes traditional RCA inefficient and prone to information overload.
Objectives: This study aims to improve systems-centered RCA by developing optimized information extraction and presentation.
Methods: We experimented with supervised machine-learning methods to extract safety-related information from 3333 de-identified patient safety event reports from two independent sources. Based on the extracted information, we further evaluated how optimized information presentation could facilitate the disclosure of system vulnerabilities in traditional RCA.
Results: Multilabel text classification is effective in identifying safety-related information from the narrative description of patient safety events. Pruned Sets in conjunction with Naïve Bayes was the best-performing algorithm on one dataset, with an overall F score of 60.0% and the highest per-label F score of 96.0% for identifying "Adverse Drug Reaction". Classifier Chains in conjunction with Naïve Bayes was the best-performing algorithm on the other dataset, with an overall F score of 43.2% and the highest per-label F score of 64.0% for identifying "Medication". During the RCA, human experts applied the optimized presentation of information, which showed advantages in identifying system vulnerabilities.
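The per-label and overall F scores reported above can be illustrated with a minimal sketch. It assumes the overall score is micro-averaged across labels, which the abstract does not state; the function name `f_scores`, the toy reports, and the predictions are hypothetical and are not the study's data or code.

```python
from typing import Dict, List, Set


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f_scores(gold: List[Set[str]], pred: List[Set[str]]) -> Dict[str, float]:
    """Return per-label F1 plus a micro-averaged 'overall' F1.

    gold[i] and pred[i] are the true and predicted label sets of report i.
    Micro-averaging is an assumption made for this sketch.
    """
    labels = set().union(*gold, *pred)
    scores: Dict[str, float] = {}
    tp_all = fp_all = fn_all = 0
    for label in sorted(labels):
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        scores[label] = f1(tp, fp, fn)
        tp_all, fp_all, fn_all = tp_all + tp, fp_all + fp, fn_all + fn
    scores["overall"] = f1(tp_all, fp_all, fn_all)
    return scores


# Hypothetical toy data: three reports, two labels taken from the abstract.
gold = [
    {"Medication"},
    {"Medication", "Adverse Drug Reaction"},
    {"Adverse Drug Reaction"},
]
pred = [
    {"Medication"},
    {"Medication"},
    {"Medication", "Adverse Drug Reaction"},
]
result = f_scores(gold, pred)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Because each report can carry several labels at once, a single prediction can be partly right. Counting true positives, false positives, and false negatives per label, and then pooling them for the overall score, captures that partial credit.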
Conclusion: Our study demonstrated the feasibility of using multilabel text classification to identify safety-related information from the narrative description of patient safety events. The extracted information, when grouped by type of safety-related information, can better help human experts conduct systems-centered RCA and disclose system vulnerabilities.
Keywords: Machine learning; Medical error; Patient safety; Root cause analysis; Text mining.
Published by Elsevier B.V.