Investigating differences in treatment effect estimates between propensity score matching and weighting: a demonstration using STAR*D trial data
- PMID: 23280682
- PMCID: PMC3639482
- DOI: 10.1002/pds.3396
Abstract
Purpose: The choice of propensity score (PS) implementation influences treatment effect estimates not only because different methods estimate different quantities, but also because different estimators respond in different ways to phenomena such as treatment effect heterogeneity and limited availability of potential matches. Using effectiveness data, we describe lessons learned from sensitivity analyses with matched and weighted estimates.
Methods: With subsample data (N = 1292) from Sequenced Treatment Alternatives to Relieve Depression, a 2001-2004 effectiveness trial of depression treatments, we implemented PS matching and weighting to estimate the treatment effect in the treated and conducted multiple sensitivity analyses.
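As an illustration of the weighting side of the Methods, the sketch below shows one common way to weight toward the treatment effect in the treated: treated observations keep a weight of 1 and untreated observations receive the odds of treatment, PS/(1 - PS). The abstract does not give the authors' implementation, so the function names (`att_weights`, `weighted_risk_ratio`) and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def att_weights(treated, ps):
    """Weights targeting the effect in the treated (assumed form):
    1 for treated observations, ps / (1 - ps) for untreated ones."""
    treated = np.asarray(treated, dtype=bool)
    ps = np.asarray(ps, dtype=float)
    return np.where(treated, 1.0, ps / (1.0 - ps))

def weighted_risk_ratio(treated, outcome, weights):
    """Ratio of the weighted outcome risk in treated vs. untreated."""
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    weights = np.asarray(weights, dtype=float)
    risk_treated = np.average(outcome[treated], weights=weights[treated])
    risk_untreated = np.average(outcome[~treated], weights=weights[~treated])
    return risk_treated / risk_untreated

# Hypothetical toy data, not taken from STAR*D.
treated = [1, 1, 0, 0]
ps = [0.8, 0.6, 0.5, 0.2]      # estimated propensity scores
outcome = [1, 0, 1, 0]         # binary outcome

w = att_weights(treated, ps)   # untreated weights: 0.5/0.5 and 0.2/0.8
rr = weighted_risk_ratio(treated, outcome, w)
print(w, rr)
```

Because an untreated observation's weight grows without bound as its PS approaches 1, a few untreated observations with high PS can dominate the weighted comparison group. Confidence intervals would additionally need a variance estimator that accounts for the weights, which this sketch omits.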
Results: Matching and weighting both balanced covariates but yielded different samples and treatment effect estimates (matched RR 1.00, 95% CI: 0.75-1.34; weighted RR 1.28, 95% CI: 0.97-1.69). In sensitivity analyses, as increasing numbers of observations at both ends of the PS distribution were excluded from the weighted analysis, weighted estimates approached the matched estimate (weighted RR 1.04, 95% CI: 0.77-1.39 after excluding all observations below the 5th percentile of the treated and above the 95th percentile of the untreated). Treatment appeared to have benefits only in the highest and lowest PS strata.
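The exclusion rule described in the Results (dropping observations below a lower percentile of the treated PS distribution and above an upper percentile of the untreated PS distribution) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `trim_mask`, the inclusive bounds, the default percentile interpolation in NumPy, and the toy data are all assumptions.

```python
import numpy as np

def trim_mask(treated, ps, lower_pct=5.0, upper_pct=95.0):
    """Boolean mask of observations kept after PS trimming.

    Lower cutoff: lower_pct percentile of PS among the treated.
    Upper cutoff: upper_pct percentile of PS among the untreated.
    Observations exactly at a cutoff are kept (an assumption).
    """
    treated = np.asarray(treated, dtype=bool)
    ps = np.asarray(ps, dtype=float)
    lower = np.percentile(ps[treated], lower_pct)
    upper = np.percentile(ps[~treated], upper_pct)
    return (ps >= lower) & (ps <= upper)

# Hypothetical toy data, not taken from STAR*D.
treated = np.array([1, 1, 1, 0, 0, 0])
ps = np.array([0.3, 0.5, 0.9, 0.1, 0.4, 0.6])

keep = trim_mask(treated, ps)
print(keep)
```

Repeating the weighted analysis on `ps[keep]` for progressively wider exclusion percentiles gives the kind of sensitivity analysis the abstract describes, in which the weighted estimate is tracked as extreme observations are removed.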
Conclusions: Matched and weighted estimates differed due to incomplete matching, sensitivity of weighted estimates to extreme observations, and possibly treatment effect heterogeneity. PS analysis requires identifying the population and treatment effect of interest, selecting an appropriate implementation method, and conducting and reporting sensitivity analyses. Weighted estimation especially should include sensitivity analyses relating to influential observations, such as those treated contrary to prediction.
Copyright © 2012 John Wiley & Sons, Ltd.
Conflict of interest statement
In the past 5 years, AE has received research funding from Merck and from the Center for Pharmacoepidemiology at the UNC Gillings School of Global Public Health, which receives industry funding. SD received funding through a Ruth L. Kirschstein-National Service Research Award Post-Doctoral Traineeship sponsored by NIMH and Harvard Medical School, Department of Health Care Policy, Grant No. T32MH01973; she has no conflict of interest to report for this paper. RH has received research support during the last 5 years from NIH and AHRQ. He also has research and consulting support from Takeda Pharmaceuticals, GlaxoSmithKline, and Novartis. Over the past 5 years, BG has received grant and research support from the Agency for Healthcare Research and Quality, NIMH, Bristol Myers Squibb, Novartis, and M-3 Information. He has served as an advisor to Bristol Myers Squibb. Over the past 5 years, JF has received unrestricted grant support from the Pfizer Foundation and consulting fees from Takeda Pharmaceuticals and Novartis Pharmaceuticals. TS receives investigator-initiated research funding and support as Principal Investigator (R01 AG023178) and Co-Investigator (R01 AG018833) from the National Institute on Aging at the National Institutes of Health. He also receives research funding as Principal Investigator of the UNC-DEcIDE center from the Agency for Healthcare Research and Quality. TS does not accept personal compensation of any kind from any pharmaceutical company, though he receives salary support from the Center for Pharmacoepidemiology and from unrestricted research grants from pharmaceutical companies to UNC.
