Propensity scores (PS) are an increasingly popular method to adjust for confounding in observational studies. Propensity score methods have theoretical advantages over conventional covariate adjustment, but their relative performance in real-world scenarios is poorly characterized. We used datasets from 4 large-scale cardiovascular observational studies (PROMETHEUS, ADAPT-DES [the Assessment of Dual AntiPlatelet Therapy with Drug-Eluting Stents], THIN [The Health Improvement Network], and CHARM [Candesartan in Heart Failure-Assessment of Reduction in Mortality and Morbidity]) to compare the performance of conventional covariate adjustment with 4 common PS methods: matching, stratification, inverse probability weighting, and use of PS as a covariate. We found that stratification performed poorly when there were few outcome events, and that inverse probability weighting gave imprecise estimates of treatment effect and undue influence to a small number of observations when substantial confounding was present. Covariate adjustment and matching performed well in all of our examples, although matching tended to give less precise estimates in some cases. PS methods are not necessarily superior to conventional covariate adjustment, and care should be taken to select the most suitable method.
Keywords: bias; comparison of methods; observational studies; regression.
Copyright © 2017 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
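Illustrative note: the abstract's remark that inverse probability weighting can give undue influence to a small number of observations can be sketched in a few lines. This is a minimal sketch, not the authors' analysis: the six records are hypothetical, the propensity scores are assumed to have been estimated already, and the function names `ipw_weights` and `weighted_risk_difference` are invented for illustration.

```python
def ipw_weights(treated, ps):
    """Inverse probability of treatment weights: 1/PS for treated
    subjects and 1/(1 - PS) for untreated subjects."""
    return [1.0 / p if t else 1.0 / (1.0 - p) for t, p in zip(treated, ps)]


def weighted_risk_difference(treated, outcome, weights):
    """Difference in weighted outcome means, treated minus untreated."""
    def weighted_mean(group):
        num = sum(w * y for t, y, w in zip(treated, outcome, weights) if t == group)
        den = sum(w for t, w in zip(treated, weights) if t == group)
        return num / den
    return weighted_mean(1) - weighted_mean(0)


# Hypothetical data: 1 = treated / event occurred, 0 = untreated / no event.
# The third subject was treated despite a propensity score of only 0.02.
treated = [1, 1, 1, 0, 0, 0]
ps = [0.8, 0.5, 0.02, 0.5, 0.2, 0.1]
outcome = [0, 1, 1, 1, 0, 0]

weights = ipw_weights(treated, ps)
rd = weighted_risk_difference(treated, outcome, weights)

# Share of the total weight carried by the single most heavily weighted subject.
max_share = max(weights) / sum(weights)
print(f"weights: {[round(w, 2) for w in weights]}")
print(f"weighted risk difference: {rd:.3f}")
print(f"largest single-subject share of total weight: {max_share:.2f}")
```

In this toy example, the treated subject with a propensity score of 0.02 receives a weight of 50 and accounts for most of the total weight, so the estimate depends heavily on that one outcome. Strong confounding produces more such extreme scores, which is the mechanism behind the imprecision described in the abstract.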