Introduction: Drug safety researchers seek to know the degree of certainty with which a particular drug is associated with an adverse drug reaction. There are different sources of information used in pharmacovigilance to identify, evaluate, and disseminate medical product safety evidence including spontaneous reports, published peer-reviewed literature, and product labels. Automated data processing and classification using these evidence sources can greatly reduce the manual curation currently required to develop reference sets of positive and negative controls (i.e. drugs that cause adverse drug events and those that do not) to be used in drug safety research.
Methods: In this paper we explore a method for automatically aggregating disparate sources of information into a single repository, developing a predictive model to classify drug-adverse event relationships, and applying those predictions to the real-world problem of identifying negative controls for statistical method calibration.
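The aggregation step described above can be illustrated with a minimal sketch. It assumes each evidence source contributes a count per drug-outcome pair; the source names, record layout, and the `aggregate` helper are hypothetical and are not taken from the paper's implementation.

```python
from collections import defaultdict

# Hypothetical evidence sources, matching the three named in the abstract.
SOURCES = ("spontaneous_reports", "literature", "product_label")

# Illustrative records only (source, drug, outcome, count); not data from the paper.
records = [
    ("spontaneous_reports", "drug_a", "outcome_x", 12),
    ("literature", "drug_a", "outcome_x", 3),
    ("product_label", "drug_a", "outcome_x", 1),
    ("literature", "drug_b", "outcome_x", 1),
]


def aggregate(records):
    """Return one feature row per (drug, outcome) pair, with one column per source."""
    rows = defaultdict(lambda: dict.fromkeys(SOURCES, 0))
    for source, drug, outcome, count in records:
        rows[(drug, outcome)][source] += count
    return dict(rows)


features = aggregate(records)
```

Each resulting row could then serve as the input vector for a classifier of drug-adverse event relationships; sources with no evidence for a pair default to zero.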
Results: Models combining all available evidence showed high predictive accuracy, with an area under the receiver operating characteristic curve of ⩾0.92 when tested on three manually generated lists of drugs and conditions that are known either to have or not to have an association with an adverse drug event.
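For readers unfamiliar with the metric, the area under the receiver operating characteristic curve equals the probability that a randomly chosen positive control is scored higher than a randomly chosen negative control. The sketch below computes it by pairwise comparison; the `auc` function and the example labels and scores are illustrative assumptions, not the paper's evaluation code or data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values: 1 = positive control, 0 = negative control.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
example_auc = auc(example_labels, example_scores)
```

The quadratic pairwise loop is chosen for clarity; for large reference sets a rank-based computation would be more efficient.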
Conclusions: Results from a pilot implementation of the method suggest that it is feasible to develop a scalable alternative to the time- and resource-intensive manual curation exercise previously applied to develop reference sets of positive and negative controls to be used in drug safety research.
Keywords: Adverse drug reaction; Health outcome; Knowledge base; Machine-learning experiment; Pharmacovigilance.
Copyright © 2016 Elsevier Inc. All rights reserved.