Purpose: Because muscle symptoms are non-specific, studies of statin-induced myopathy (SIM) in electronic health records require accurate algorithms that can reliably identify cases truly caused by statins. Prior algorithms were constructed in study populations that preclude broad applicability.
Methods: Using only structured data elements of electronic health records, we developed a two-step algorithm to identify true SIM cases. Step 1 excluded potential cases that were linked to secondary (non-statin) causes. Step 2 required evidence of statin discontinuation to deem remaining potential cases as true. To validate our algorithm, we first randomly selected a subset of statin users with potential SIM. From this subset, we then identified true SIM cases using two different methods: (1) our algorithm and (2) manual chart review. Algorithm performance was assessed by comparing the algorithm-derived SIM cases to those from manual chart review, which was considered the gold standard. Metrics of algorithm performance included sensitivity, specificity, and positive predictive value. We then applied this algorithm to identify predictors of SIM.
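The two-step logic and the validation metrics described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the record fields `has_secondary_cause` and `statin_discontinued`, the function names, and the four toy records are hypothetical, and the abstract does not specify how either flag is derived from structured data.

```python
from dataclasses import dataclass


@dataclass
class PotentialCase:
    # Hypothetical flags; how they are derived from structured EHR data
    # is not described in the abstract.
    has_secondary_cause: bool   # a secondary (non-statin) cause is recorded
    statin_discontinued: bool   # evidence that the statin was discontinued


def is_algorithm_sim(case: PotentialCase) -> bool:
    """Apply the two steps in order and return True for an algorithm-derived case."""
    # Step 1: exclude potential cases linked to a secondary cause.
    if case.has_secondary_cause:
        return False
    # Step 2: require evidence of statin discontinuation.
    return case.statin_discontinued


def performance(predicted, gold):
    """Compare algorithm labels with gold-standard (chart review) labels."""
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    fn = sum(not p and g for p, g in zip(predicted, gold))
    tn = sum(not p and not g for p, g in zip(predicted, gold))

    def ratio(num, den):
        # Undefined metrics (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Toy validation subset (fabricated for illustration only, not study data).
cases = [
    PotentialCase(has_secondary_cause=False, statin_discontinued=True),
    PotentialCase(has_secondary_cause=True, statin_discontinued=True),
    PotentialCase(has_secondary_cause=False, statin_discontinued=False),
    PotentialCase(has_secondary_cause=False, statin_discontinued=True),
]
gold = [True, False, True, False]  # hypothetical chart-review labels

predicted = [is_algorithm_sim(c) for c in cases]
metrics = performance(predicted, gold)
```

Because Step 2 only runs on cases that survive Step 1, a record with a documented secondary cause is never counted as SIM, even when the statin was stopped.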
Results: Our algorithm had 76% sensitivity, 77% specificity, and 48% positive predictive value for detecting gold-standard cases. Applying our algorithm, we identified 1257 algorithm-derived SIM cases from 5430 potential cases. After adjusting for prespecified factors, pravastatin use was associated with higher odds of SIM than lovastatin use (odds ratio 2.18, 95% confidence interval [CI] 1.39-3.40, p = 0.0007). Hypothyroidism was associated with the largest increase in odds (odds ratio 15.65, 95% CI 9.12-26.83, p < 0.0001).
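As a rough consistency check on the reported estimates, the standard error on the log-odds scale can be back-calculated from a 95% CI and used to approximate a two-sided p-value. The sketch below assumes Wald-type intervals that are symmetric on the log scale; the abstract does not state how the intervals or p-values were computed, and the function name `wald_p_from_or_ci` is hypothetical.

```python
import math


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z statistic, and two-sided p-value from an OR and its 95% CI.

    Assumes a Wald interval: exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values as reported in the Results.
se_prava, z_prava, p_prava = wald_p_from_or_ci(2.18, 1.39, 3.40)
se_hypo, z_hypo, p_hypo = wald_p_from_or_ci(15.65, 9.12, 26.83)
```

Under this approximation, the pravastatin estimate gives a p-value of the same order as the reported 0.0007, and the hypothyroidism estimate gives a p-value far below 0.0001. Small differences from the reported values are expected, because the inputs are rounded.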
Conclusions: We have produced an efficient, easy-to-apply methodological tool that can improve the quality of future research on SIM.
Keywords: EHR; adverse drug reaction; creatine kinase; myotoxicity; real‐world evidence.
© 2025 The Author(s). Pharmacoepidemiology and Drug Safety published by John Wiley & Sons Ltd.