Objective: To quantify and compare the treatment effect and risk of bias of trials reporting biomarkers or intermediate outcomes (surrogate outcomes) versus trials using final patient relevant primary outcomes.
Design: Meta-epidemiological study.
Data sources: All randomised clinical trials published in 2005 and 2006 in six high impact medical journals: Annals of Internal Medicine, BMJ, Journal of the American Medical Association, Lancet, New England Journal of Medicine, and PLoS Medicine.
Study selection: Two independent reviewers selected trials.
Data extraction: Trial characteristics, risk of bias, and outcomes were recorded according to a predefined form. Two reviewers independently checked data extraction. The ratio of odds ratios was used to quantify the difference in treatment effects between trials using surrogate outcomes and those using patient relevant outcomes, both unadjusted and adjusted for trial characteristics. A ratio of odds ratios >1.0 implies that trials with surrogate outcomes report larger intervention effects than trials with patient relevant outcomes.
Results: 84 trials using surrogate outcomes and 101 using patient relevant outcomes were considered for analyses. Study characteristics of the two groups of trials were well balanced, except for median sample size (371 v 741) and single centre status (23% v 9%). Risk of bias did not differ between the groups. The primary analysis showed that trials reporting surrogate endpoints had larger treatment effects (odds ratio 0.51, 95% confidence interval 0.42 to 0.60) than trials reporting patient relevant outcomes (0.76, 0.70 to 0.82), with an unadjusted ratio of odds ratios of 1.47 (1.07 to 2.01) and an adjusted ratio of odds ratios of 1.46 (1.05 to 2.04). This result was consistent across sensitivity and secondary analyses.
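Illustration: The arithmetic behind a ratio of odds ratios can be sketched from the two pooled estimates above. This is a simplified approximation, not the authors' analysis. It assumes that the ratio is taken as the patient relevant odds ratio divided by the surrogate odds ratio (the direction that makes values >1.0 mean larger effects in surrogate trials), that the two pooled estimates are independent, and that standard errors can be recovered from the reported 95% confidence intervals on the log scale. The function names are hypothetical. Under these assumptions the result is about 1.49 (1.23 to 1.81), close to the reported point estimate of 1.47 but with a narrower interval, which is expected if the published model allowed for between-trial heterogeneity.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Approximate standard error of log(OR) from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def ratio_of_odds_ratios(or_patient, ci_patient, or_surrogate, ci_surrogate):
    """Naive ratio of odds ratios with a 95% CI.

    Assumption (not stated in the source): ROR = OR_patient / OR_surrogate,
    and the two pooled estimates are independent.
    """
    log_ror = math.log(or_patient) - math.log(or_surrogate)
    # Variances of independent log odds ratios add.
    se = math.hypot(se_from_ci(*ci_patient), se_from_ci(*ci_surrogate))
    return (
        math.exp(log_ror),
        math.exp(log_ror - Z95 * se),
        math.exp(log_ror + Z95 * se),
    )


# Pooled estimates as reported in the Results.
ror, lower, upper = ratio_of_odds_ratios(
    0.76, (0.70, 0.82),  # patient relevant outcomes
    0.51, (0.42, 0.60),  # surrogate outcomes
)
print(f"{ror:.2f} ({lower:.2f} to {upper:.2f})")
```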
Conclusions: Trials reporting surrogate primary outcomes are more likely to report larger treatment effects than trials reporting final patient relevant primary outcomes. This finding was not explained by differences in the risk of bias or characteristics of the two groups of trials.