A text mining approach to categorize patient safety event reports by medication error type

Sci Rep. 2023 Oct 26;13(1):18354. doi: 10.1038/s41598-023-45152-w.


Patient safety reporting systems give healthcare staff the ability to report medication-related safety events and errors; however, many of these reports go unanalyzed, and safety hazards go undetected. The objective of this study was to examine whether natural language processing can be used to better categorize medication-related patient safety event reports. A total of 3,861 medication-related patient safety event reports, previously annotated using a consolidated medication error taxonomy, were used to develop three models with the following algorithms: (1) logistic regression, (2) elastic net, and (3) XGBoost. After development, the models were tested and their performance was analyzed. The XGBoost model performed best across all medication error categories. Across the three models, performance was highest for the 'Wrong Drug', 'Wrong Dosage Form or Technique or Route', and 'Improper Dose/Dose Omission' categories. In addition, we identified the five words most closely associated with each medication error category and determined which medication error categories were most likely to co-occur. Machine learning techniques offer a semi-automated method for identifying specific medication error types from the free text of patient safety event reports. These algorithms have the potential to improve the categorization of medication-related patient safety event reports, which may lead to better identification of important medication safety patterns and trends.
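
The abstract does not say how the associated words or the category co-occurrences were computed. As a rough illustration only, the Python sketch below assumes multi-label reports, counts how often pairs of categories appear on the same report, and ranks words by a smoothed log-odds ratio of document frequency. The toy reports, the function names, and the log-odds scoring are all assumptions made for this example; they are not the study's data or its method.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy reports: (free text, set of error categories).
# Invented for illustration only; these are NOT data from the study.
REPORTS = [
    ("patient received wrong drug from pharmacy",
     {"Wrong Drug"}),
    ("wrong drug dispensed and dose omitted overnight",
     {"Wrong Drug", "Improper Dose/Dose Omission"}),
    ("scheduled dose omitted during shift change",
     {"Improper Dose/Dose Omission"}),
    ("tablet given by wrong route",
     {"Wrong Dosage Form or Technique or Route"}),
]


def tokenize(text):
    """Return the set of lowercase alphabetic tokens in a report."""
    return set(re.findall(r"[a-z]+", text.lower()))


def cooccurrence_counts(reports):
    """Count how often each unordered pair of categories shares a report."""
    counts = Counter()
    for _, categories in reports:
        # Sorting makes each pair a canonical (alphabetical) tuple.
        for pair in combinations(sorted(categories), 2):
            counts[pair] += 1
    return counts


def top_words(reports, category, k=5):
    """Rank words by smoothed log-odds of appearing in reports with `category`.

    Assumed scoring (not taken from the paper): the difference between the
    logit of a word's document frequency inside and outside the category,
    with add-0.5 smoothing so neither frequency reaches 0 or 1.
    """
    inside, outside = Counter(), Counter()
    n_inside = n_outside = 0
    for text, categories in reports:
        words = tokenize(text)
        if category in categories:
            inside.update(words)
            n_inside += 1
        else:
            outside.update(words)
            n_outside += 1

    scores = {}
    for word in set(inside) | set(outside):
        p = (inside[word] + 0.5) / (n_inside + 1.0)
        q = (outside[word] + 0.5) / (n_outside + 1.0)
        scores[word] = math.log(p / (1 - p)) - math.log(q / (1 - q))

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    print(cooccurrence_counts(REPORTS).most_common())
    print(top_words(REPORTS, "Wrong Drug"))
```

On real data, the classifier's own feature importances or coefficients could serve the same purpose as the log-odds score; the sketch only shows the shape of the two analyses the abstract mentions.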

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Data Mining
  • Humans
  • Logistic Models
  • Medication Errors*
  • Patient Safety*
  • Research Report