Development and Validation of a Deep Learning Model for Detection of Allergic Reactions Using Safety Event Reports Across Hospitals

JAMA Netw Open. 2020 Nov 2;3(11):e2022836. doi: 10.1001/jamanetworkopen.2020.22836.


Importance: Although critical to patient safety, health care-related allergic reactions are challenging to identify and monitor.

Objective: To develop a deep learning model to identify allergic reactions in the free-text narrative of hospital safety reports and evaluate its generalizability, efficiency, productivity, and interpretability.

Design, setting, and participants: This cross-sectional study analyzed hospital safety reports filed between May 2004 and January 2019 at Brigham and Women's Hospital and between April 2006 and June 2018 at Massachusetts General Hospital in Boston. To train and validate the deep learning model, safety reports were extracted from Massachusetts General Hospital using 101 expert-curated keywords (data set I). The model was then evaluated on 3 data sets: reports without keywords (data set II), reports from a different time frame (data set III), and reports from a different hospital (Brigham and Women's Hospital; data set IV). Statistical analyses were performed between March 1, 2019, and July 18, 2020.
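As a rough illustration of how keyword-based extraction separates reports that contain keywords from reports that do not (the distinction behind data sets I and II), a minimal sketch follows. The keyword list, the sample reports, and the function name split_by_keywords are hypothetical; the 101 expert-curated keywords are not listed here, and the authors' actual matching rules may differ.

```python
import re

# Hypothetical stand-in for the expert-curated keyword list (the real list is not given).
KEYWORDS = ["rash", "hives", "anaphylaxis", "itching"]

# One case-insensitive pattern that matches any keyword as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def split_by_keywords(reports):
    """Return (matched, unmatched) lists of free-text reports."""
    matched, unmatched = [], []
    for text in reports:
        if _PATTERN.search(text):
            matched.append(text)
        else:
            unmatched.append(text)
    return matched, unmatched

# Made-up example narratives, for illustration only.
reports = [
    "Patient developed hives after contrast injection.",
    "Patient fell while walking to the bathroom.",
    "RASH noted on both arms following antibiotic dose.",
]
with_kw, without_kw = split_by_keywords(reports)
```

In this sketch, with_kw plays the role of the keyword-extracted pool and without_kw the pool of reports that a keyword search alone would never surface.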

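Main outcomes and measures: The area under the receiver operating characteristic curve and the area under the precision-recall curve were used to evaluate the model on data set I. Precision at top-k (the proportion of confirmed allergic reactions among the k reports ranked highest by the model) was used on data sets II to IV.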
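The two ranking measures named above can be sketched in a few lines of plain Python. This is an illustrative implementation under simple assumptions (binary labels, one score per report), not the authors' evaluation code; the scores and labels are invented.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored reports whose label is 1."""
    if k <= 0 or not scores:
        raise ValueError("need k > 0 and at least one scored report")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    # Divides by the number of reports actually taken (fewer than k if the list is short).
    return sum(label for _, label in top) / len(top)

def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented model scores and reviewer labels (1 = confirmed allergic reaction).
scores = [0.95, 0.10, 0.80, 0.40, 0.70, 0.30]
labels = [1, 0, 1, 0, 0, 1]

p_at_3 = precision_at_k(scores, labels, 3)  # 2 of the top 3 are confirmed
roc_area = auroc(scores, labels)            # 7 of 9 positive-negative pairs are ordered correctly
```

Precision at top-k suits data sets where exhaustive labeling is impractical, because only the k highest-ranked reports need manual review.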

Results: A total of 299 028 safety reports involving 172 854 patients were included. Of these patients, 86 544 (50.1%) were women, and the median (interquartile range [IQR]) age was 59.7 (43.8-71.6) years. On data set I, the deep learning model achieved an area under the receiver operating characteristic curve of 0.979 (95% CI, 0.973-0.985) and an area under the precision-recall curve of 0.809 (95% CI, 0.773-0.845). The model achieved precisions at the top 100 model-identified cases of 0.930 in data set II, 0.960 in data set III, and 0.990 in data set IV. Compared with the keyword-search approach, the deep learning model reduced the number of cases for manual review by 63.8% and identified 24.2% more cases of confirmed allergic reactions. Through an attention layer, the model highlighted words important to its predictions (eg, rash, hives, and Benadryl) and extended the list of expert-curated keywords.
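To illustrate how an attention layer can surface influential words, the sketch below applies a softmax to per-token relevance scores and lists the highest-weighted tokens. It shows generic attention weighting only; the abstract does not describe the model architecture, and the sentence, the scores, and the function names are hypothetical.

```python
import math

def attention_weights(raw_scores):
    """Softmax: convert per-token relevance scores into weights that sum to 1."""
    peak = max(raw_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_tokens(tokens, raw_scores, n=3):
    """Return the n (token, weight) pairs with the largest attention weights."""
    weights = attention_weights(raw_scores)
    ranked = sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)
    return ranked[:n]

tokens = ["patient", "developed", "rash", "and", "hives", "given", "benadryl"]
# Hypothetical scores; a trained model would compute these from learned parameters.
raw_scores = [0.1, 0.2, 2.5, -1.0, 2.8, 0.3, 2.2]

highlighted = top_tokens(tokens, raw_scores)
```

Words that repeatedly receive high weights across confirmed cases but are absent from a curated keyword list are natural candidates for extending that list.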

Conclusions and relevance: This study showed that a deep learning model can accurately and efficiently identify allergic reactions using free-text narratives written by a variety of health care professionals. This model could be used to improve allergy care, potentially enabling real-time event surveillance and guidance for medical errors and system improvement.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adult
  • Aged
  • Algorithms*
  • Boston
  • Cross-Sectional Studies
  • Deep Learning*
  • Diagnosis, Computer-Assisted / methods*
  • Female
  • Humans
  • Hypersensitivity / diagnosis*
  • Male
  • Middle Aged
  • Patient Safety / statistics & numerical data*
  • ROC Curve
  • Reproducibility of Results