Objective: To evaluate the performance of a disproportionality design, commonly used for analysis of spontaneous reports data such as the FDA Adverse Event Reporting System database, as a potential analytical method for an adverse drug reaction risk identification system using healthcare data.
Research design: We tested the disproportionality design in 5 real observational healthcare databases and 6 simulated datasets, retrospectively studying the predictive accuracy of the method when applied to a collection of 165 positive controls and 234 negative controls across 4 outcomes: acute liver injury, acute myocardial infarction, acute kidney injury, and upper gastrointestinal bleeding.
Measures: We estimated how well the method can be expected to identify true effects and discriminate them from false findings, and explored the statistical properties of the estimates the design generates. The primary measure was the area under the receiver operating characteristic (ROC) curve (AUC).
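As a rough illustration of the primary measure, the AUC can be read as the probability that a randomly chosen positive control receives a higher score than a randomly chosen negative control, with ties counted as one half. The Python sketch below computes it that way; the function name `auc_from_controls` and the example scores are hypothetical and are not taken from the study.

```python
def auc_from_controls(positive_scores, negative_scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive control has the higher score; ties count as 0.5.

    Assumption: a higher score means stronger evidence of an association.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative control")

    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    positives = [1.0, 3.0]
    negatives = [2.0, 4.0]
    print(auc_from_controls(positives, negatives))
```

Under this reading, a method that scores positive and negative controls alike gives a value near 0.5, and a value below 0.5 means negative controls tend to score higher than positive controls.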
Results: For each combination of 4 outcomes and 5 databases, 48 versions of disproportionality analysis (DPA) were carried out and the AUC was computed. Most AUC values fell in the range 0.35 < AUC < 0.6, which indicates poor predictive accuracy, since AUC = 0.5 would be expected from random assignment alone. Several DPA versions achieved an AUC of about 0.7 for the outcome acute renal failure within the GE database. The best-performing DPA version overall across all 20 outcome-database combinations was the Bayesian Information Component method with no stratification by age and gender, using first occurrence of the outcome and an assumed time-at-risk equal to the duration of exposure + 30 d, but no version was uniformly optimal. The relative risk estimates for the negative control drug-event combinations were very often biased upward or downward by a factor of 2 or more. Coverage probabilities of the confidence intervals from all methods were far below the nominal level.
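To make the notions of a disproportionality estimate and of confidence interval coverage concrete, the Python sketch below computes a proportional reporting ratio (PRR) with a 95% confidence interval from a 2x2 table, and then the share of intervals that contain the true value. The PRR is used here only as a simple, widely known disproportionality measure; it is an assumption for illustration and is not the Bayesian Information Component method named above. The function names, the cell labels a, b, c, d, and all counts are hypothetical.

```python
import math


def prr_with_ci(a, b, c, d, z=1.96):
    """Proportional reporting ratio and an approximate 95% interval.

    Hypothetical 2x2 layout (an assumption, not the study's definition):
      a: drug of interest, outcome of interest
      b: drug of interest, other outcomes
      c: other drugs, outcome of interest
      d: other drugs, other outcomes
    """
    if a <= 0 or c <= 0 or b < 0 or d < 0:
        raise ValueError("a and c must be positive; b and d non-negative")

    prr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(PRR), the usual ratio-of-proportions formula.
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - z * se)
    upper = math.exp(math.log(prr) + z * se)
    return prr, lower, upper


def coverage(intervals, true_value=1.0):
    """Fraction of (lower, upper) intervals that contain true_value.

    For negative controls the true relative risk is taken to be 1.
    """
    if not intervals:
        raise ValueError("need at least one interval")
    covered = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return covered / len(intervals)


if __name__ == "__main__":
    # Hypothetical counts for two negative controls, for illustration only.
    tables = [(10, 90, 10, 90), (40, 60, 10, 90)]
    intervals = [prr_with_ci(*t)[1:] for t in tables]
    print(coverage(intervals))
```

With well-calibrated 95% intervals, roughly 95% of negative-control intervals would be expected to contain 1; coverage far below that level is what the results above describe.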
Conclusions: The disproportionality methods that we evaluated did not discriminate true positives from true negatives in healthcare data as they appear to do in spontaneous report data.