Automated identification of drug and food allergies entered using non-standard terminology

J Am Med Inform Assoc. 2013 Sep-Oct;20(5):962-8. doi: 10.1136/amiajnl-2013-001756. Epub 2013 Jun 7.


Objective: An accurate computable representation of food and drug allergy is essential for safe healthcare. Our goal was to develop a high-performance, easily maintained algorithm to identify medication and food allergies and sensitivities from unstructured allergy entries in electronic health record (EHR) systems.

Materials and methods: An algorithm was developed in Transact-SQL to identify the ingredients to which patients had allergies, using entries from a perioperative information management system. The algorithm used RxNorm and natural language processing techniques and was developed on a training set of 24 599 entries from 9445 records. Accuracy, specificity, precision, recall, and F-measure were determined for the training dataset and then for the testing dataset (24 857 entries from 9430 records).
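
The general idea of mapping a free-text allergy entry to ingredients can be illustrated with a minimal Python sketch. This is not the authors' Transact-SQL algorithm: the INGREDIENTS dictionary is a small hypothetical stand-in for an RxNorm-derived vocabulary, the NEGATIONS set is an assumed list of "no allergy" phrases, and the function names are invented for illustration.

```python
import re
from typing import List

# Hypothetical stand-in for an ingredient vocabulary such as one derived from
# RxNorm: maps a normalized name or synonym to a canonical ingredient.
INGREDIENTS = {
    "penicillin": "penicillin",
    "pcn": "penicillin",
    "amoxicillin": "amoxicillin",
    "sulfa": "sulfonamides",
    "codeine": "codeine",
    "peanut": "peanut",
    "peanuts": "peanut",
    "shellfish": "shellfish",
}

# Assumed phrases indicating that the entry records the absence of an allergy.
NEGATIONS = {"nka", "nkda", "none", "no known allergies", "no known drug allergies"}


def normalize(entry: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", entry.lower())
    return re.sub(r"\s+", " ", text).strip()


def match_ingredients(entry: str) -> List[str]:
    """Return canonical ingredients found in a free-text entry, in order of appearance."""
    text = normalize(entry)
    if text in NEGATIONS:
        return []
    found: List[str] = []
    for token in text.split():
        canonical = INGREDIENTS.get(token)
        if canonical is not None and canonical not in found:
            found.append(canonical)
    return found
```

For example, match_ingredients("PCN - rash") returns ["penicillin"], while an entry such as "NKDA" or an unrecognized term returns an empty list. A real system would need far more than token lookup (misspellings, multi-word names, brand-to-ingredient mapping), which this sketch deliberately omits.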

Results: Accuracy, precision, recall, and F-measure for medication allergy matches were all above 98% in the training dataset and above 97% in the testing dataset for all allergy entries. Corresponding values for food allergy matches were above 97% and above 93%, respectively. Specificities of the algorithm were 90.3% and 85.0% for drug matches and 100% and 88.9% for food matches in the training and testing datasets, respectively.
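
For readers less familiar with these measures, the sketch below computes them from confusion-matrix counts using the standard definitions. It assumes the F-measure is the balanced F1 score, which the abstract does not specify, and the counts in the usage line are hypothetical, not taken from the study. It only illustrates that specificity depends on the true-negative and false-positive counts alone and so can differ noticeably from the other measures; it does not reproduce or explain the reported values.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)   # share of reported matches that are correct
    recall = tp / (tp + fn)      # share of true matches that were found
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),  # share of true non-matches correctly rejected
        # Balanced F-measure (F1); assumed here, not stated in the abstract.
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=980, fp=5, tn=10, fn=5)
```

With these made-up counts, accuracy is 0.99 while specificity is 10/15, because only 15 of the 1000 entries are true non-matches.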

Discussion: The algorithm performed well in identifying medication and food allergies. Maintenance is practical, as updates are managed by uploading new RxNorm versions and adding entries to companion database tables. However, direct entry of codified allergy information by providers (through autocompleters or drop-down lists) is still preferred to post-hoc encoding of the data. Data tables used in the algorithm are available for download.

Conclusions: A high-performing, easily maintained algorithm can successfully identify medication and food allergies from free-text entries in EHR systems.

Keywords: Allergies; Electronic health records; Electronic medical records; Hypersensitivity; Natural language processing; RxNorm.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Data Mining / methods*
  • Drug Hypersensitivity*
  • Electronic Health Records*
  • Food Hypersensitivity*
  • Humans
  • Medical Records Systems, Computerized