Background: The absolute risk reduction (ARR) in cardiovascular events from therapy is generally assumed to be proportional to baseline risk, such that high-risk patients benefit most. Yet newer analyses have proposed using randomized trial data to develop models that estimate individual treatment effects. We tested 2 hypotheses: first, that models of individual treatment effects would reveal that benefit from intensive blood pressure therapy is proportional to baseline risk; and second, that a machine learning approach designed to predict heterogeneous treatment effects (the X-learner meta-algorithm) is equivalent to a conventional logistic regression approach.
Methods and results: We compared conventional logistic regression to the X-learner approach for prediction of 3-year cardiovascular disease event risk reduction from intensive (target systolic blood pressure <120 mm Hg) versus standard (target <140 mm Hg) blood pressure treatment, using individual participant data from the SPRINT (Systolic Blood Pressure Intervention Trial; N=9361) and ACCORD BP (Action to Control Cardiovascular Risk in Diabetes Blood Pressure; N=4733) trials. Each model incorporated 17 covariates, an indicator for treatment arm, and interaction terms between covariates and treatment. Logistic regression had a lower C statistic for benefit than the X-learner (0.51 [95% CI, 0.49-0.53] versus 0.60 [95% CI, 0.58-0.63], respectively). Following the logistic regression's recommendation for individualized therapy produced a restricted mean time until cardiovascular disease event of 1065.47 days (95% CI, 1061.04-1069.35), whereas following the X-learner's recommendation improved this to 1068.71 days (95% CI, 1065.42-1072.08). Calibration was worse for logistic regression, which overestimated the ARR attributable to intensive treatment (slope between predicted and observed ARR of 0.73 [95% CI, 0.30-1.14] versus 1.06 [95% CI, 0.74-1.32] for the X-learner, compared with the ideal of 1). Predicted ARRs using logistic regression were generally proportional to baseline pretreatment cardiovascular risk, whereas the X-learner correctly observed that individual treatment effects were often not proportional to baseline risk.
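For readers unfamiliar with the X-learner, its general steps can be sketched as follows. This is an illustrative sketch only, not the study's implementation: it assumes a single covariate, simple least-squares lines as base learners, a continuous risk-like outcome, and a constant propensity of 0.5 (as in a 1:1 randomized trial). The names `fit_line` and `x_learner_arr` are hypothetical.

```python
from typing import Callable, Sequence


def fit_line(x: Sequence[float], y: Sequence[float]) -> Callable[[float], float]:
    """Fit y = a + b*x by ordinary least squares and return a predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return lambda v: intercept + slope * v


def x_learner_arr(
    x: Sequence[float],
    treated: Sequence[int],
    y: Sequence[float],
    propensity: float = 0.5,  # assumption: constant, as in a 1:1 randomized trial
) -> Callable[[float], float]:
    """Return a function estimating ARR (control risk minus treated risk) at x."""
    x0 = [xi for xi, t in zip(x, treated) if t == 0]
    y0 = [yi for yi, t in zip(y, treated) if t == 0]
    x1 = [xi for xi, t in zip(x, treated) if t == 1]
    y1 = [yi for yi, t in zip(y, treated) if t == 1]

    # Stage 1: model the outcome separately in each arm.
    mu0 = fit_line(x0, y0)  # standard-treatment arm
    mu1 = fit_line(x1, y1)  # intensive-treatment arm

    # Stage 2: impute an individual risk reduction for every participant
    # using the model fitted in the opposite arm.
    d1 = [mu0(xi) - yi for xi, yi in zip(x1, y1)]  # treated participants
    d0 = [yi - mu1(xi) for xi, yi in zip(x0, y0)]  # control participants

    # Stage 3: model the imputed effects within each arm.
    tau1 = fit_line(x1, d1)
    tau0 = fit_line(x0, d0)

    # Stage 4: combine the two effect models, weighted by the propensity.
    return lambda v: propensity * tau0(v) + (1 - propensity) * tau1(v)


# Hypothetical noiseless data: control risk 0.10 + 0.20*x, true ARR 0.02 + 0.10*x.
xs = [i / 20 for i in range(20)]
arm = [i % 2 for i in range(20)]
ys = [0.10 + 0.20 * xi - t * (0.02 + 0.10 * xi) for xi, t in zip(xs, arm)]
estimate_arr = x_learner_arr(xs, arm, ys)
print(round(estimate_arr(0.5), 4))
```

In practice the base learners would be flexible models fitted on all covariates and a binary event outcome; the lines here only keep the sketch self-contained.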
Conclusions: Predictions for individual treatment effects from trial data reveal that patients may experience ARRs not simply proportional to baseline cardiovascular disease risk. Machine learning methods may improve discrimination and calibration of individualized treatment effect estimates from clinical trial data.
Keywords: blood pressure; calibration; cardiovascular disease; machine learning; risk factors.