J Am Med Inform Assoc. Sep-Oct 2013;20(5):962-8.
doi: 10.1136/amiajnl-2013-001756. Epub 2013 Jun 7.

Automated Identification of Drug and Food Allergies Entered Using Non-Standard Terminology

Free PMC article

Richard H Epstein et al. J Am Med Inform Assoc.


Objective: An accurate computable representation of food and drug allergy is essential for safe healthcare. Our goal was to develop a high-performance, easily maintained algorithm to identify medication and food allergies and sensitivities from unstructured allergy entries in electronic health record (EHR) systems.

Materials and methods: An algorithm was developed in Transact-SQL to identify ingredients to which patients had allergies in a perioperative information management system. The algorithm used RxNorm and natural language processing techniques developed on a training set of 24 599 entries from 9445 records. Accuracy, specificity, precision, recall, and F-measure were determined for the training dataset and repeated for the testing dataset (24 857 entries from 9430 records).
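
As an illustration of the five measures named above, the following minimal sketch computes them from a confusion matrix using their standard textbook definitions. The abstract does not spell out the formulas the authors used, so the definitions here are an assumption; the `Confusion` class, the `metrics` function, and the counts in the example are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of match outcomes (hypothetical container, not from the paper)."""
    tp: int  # entries correctly matched to an allergen
    fp: int  # entries matched to an allergen incorrectly
    tn: int  # entries correctly left unmatched
    fn: int  # entries that should have matched but did not


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics(c: Confusion) -> dict[str, float]:
    """Standard definitions of the five measures listed in the abstract."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "precision": precision,
        "recall": recall,
        # F-measure as the harmonic mean of precision and recall (F1).
        "f_measure": _ratio(2 * precision * recall, precision + recall),
    }


# Illustrative counts only; these are NOT results from the study.
example = metrics(Confusion(tp=90, fp=5, tn=45, fn=10))
```

Specificity depends only on the entries that should not match (`tn` and `fp`), so when such entries are few it can differ noticeably from the other four measures.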

Results: Accuracy, precision, recall, and F-measure for medication allergy matches were all above 98% in the training dataset and above 97% in the testing dataset for all allergy entries. Corresponding values for food allergy matches were above 97% and above 93%, respectively. Specificities of the algorithm were 90.3% and 85.0% for drug matches and 100% and 88.9% for food matches in the training and testing datasets, respectively.

Discussion: The algorithm had high performance for identification of medication and food allergies. Maintenance is practical, as updates are managed through upload of new RxNorm versions and additions to companion database tables. However, direct entry of codified allergy information by providers (through autocompleters or drop lists) is still preferred to post-hoc encoding of the data. Data tables used in the algorithm are available for download.

Conclusions: A high-performing, easily maintained algorithm can successfully identify medication and food allergies from free text entries in EHR systems.

Keywords: Allergies; Electronic health records; Electronic medical records; Hypersensitivity; Natural language processing; RxNorm.


Figure 1
Algorithm flowchart. This flow diagram outlines the steps the Transact-SQL code takes in attempting to match each entry to an RxNorm term. The sequential steps were ordered to match the most specific terms first, then to match the remaining entries to less specific terms. Two passes were made through the algorithm, the second bypassing steps that could not logically result in any additional matches. The second pass was applied to entries created by splitting lines where multiple potential allergies had been entered in the allergy field (eg, several words separated by commas). The letters in parentheses in the boxes of the flowchart reference the corresponding steps in table 1, which provides a description of the processing being done, along with examples. Pseudocode for the algorithm is provided in appendix 1.
