Comparative Study
Health Serv Res. 2013 Oct;48(5):1798-817.
doi: 10.1111/1475-6773.12068. Epub 2013 May 23.

An empirical comparison of tree-based methods for propensity score estimation

Stephanie Watkins et al. Health Serv Res. 2013 Oct.

Abstract

Objective: To illustrate the use of ensemble tree-based methods (random forest classification [RFC] and bagging) for propensity score estimation and to compare these methods with logistic regression, in the context of evaluating the effect of physical and occupational therapy on preschool motor ability among very low birth weight (VLBW) children.

Data source: We used secondary data from the Early Childhood Longitudinal Study Birth Cohort (ECLS-B) between 2001 and 2006.

Study design: We estimated the predicted probability of treatment using tree-based methods and logistic regression (LR). We then modeled the exposure-outcome relation using weighted LR models while considering covariate balance and precision for each propensity score estimation method.
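To make the first step of this design concrete, the following is a minimal sketch of bagging for propensity score estimation: many small trees are fit to bootstrap resamples, and their predicted treatment probabilities are averaged. It is not the authors' implementation. The single covariate, the toy data, the use of depth-1 trees (stumps), and the function names `fit_stump` and `bagged_propensity` are all assumptions made for illustration.

```python
import random

def fit_stump(xs, ts):
    """Fit a depth-1 tree on one covariate.

    Returns (threshold, p_left, p_right), where p_left / p_right are the
    treated fractions on each side of the split that minimizes squared error.
    """
    best = None
    for thr in sorted(set(xs)):
        left = [t for x, t in zip(xs, ts) if x <= thr]
        right = [t for x, t in zip(xs, ts) if x > thr]
        if not left or not right:
            continue  # a split must leave units on both sides
        p_left = sum(left) / len(left)
        p_right = sum(right) / len(right)
        sse = (sum((t - p_left) ** 2 for t in left)
               + sum((t - p_right) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, thr, p_left, p_right)
    if best is None:
        # No valid split (all x identical): predict the overall treated fraction.
        p = sum(ts) / len(ts)
        return (float("inf"), p, p)
    return best[1:]

def bagged_propensity(xs, ts, n_trees=200, seed=0):
    """Average stump predictions over bootstrap resamples (bagging)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ts[i] for i in idx]))
    scores = []
    for x in xs:
        preds = [p_l if x <= thr else p_r for thr, p_l, p_r in stumps]
        scores.append(sum(preds) / len(preds))
    return scores

# Hypothetical toy data: one covariate (e.g. birth weight in grams) and a
# treatment indicator that is more common at lower values of the covariate.
weights_g = [600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500]
treated = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

scores = bagged_propensity(weights_g, treated)
for x, s in zip(weights_g, scores):
    print(x, round(s, 3))
```

A random forest differs from this sketch mainly in that each split considers only a random subset of covariates, which matters once there is more than one covariate. In practice a library implementation with deeper trees would normally be used.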

Principal findings: Among approximately 500 VLBW children, therapy receipt was associated with moderately improved preschool motor ability. Overall, ensemble methods produced better covariate balance (mean squared difference: 0.03-0.07) than LR (mean squared difference: 0.11) and the most precise effect estimates. The overall magnitude of the effect estimates was similar between RFC and LR estimation methods.
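As a rough illustration of what a covariate balance check involves, the sketch below weights each child by the inverse of the estimated probability of the treatment actually received and compares a covariate between groups before and after weighting. The abstract does not specify the weighting scheme or define its balance metric, so the inverse-probability-of-treatment weights, the standardized mean difference, the toy data, and all function names here are assumptions rather than the study's procedure.

```python
import math

def iptw_weights(treated, scores):
    """Inverse-probability-of-treatment weights: 1/p if treated, 1/(1-p) otherwise."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p)
            for t, p in zip(treated, scores)]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def standardized_difference(x, treated, weights=None):
    """Difference in (weighted) group means divided by the pooled standard deviation."""
    if weights is None:
        weights = [1.0] * len(x)
    stats = {}
    for group in (1, 0):
        xs = [v for v, t in zip(x, treated) if t == group]
        ws = [w for w, t in zip(weights, treated) if t == group]
        mean = weighted_mean(xs, ws)
        var = weighted_mean([(v - mean) ** 2 for v in xs], ws)
        stats[group] = (mean, var)
    pooled_sd = math.sqrt((stats[1][1] + stats[0][1]) / 2.0)
    return (stats[1][0] - stats[0][0]) / pooled_sd

# Hypothetical toy data: a binary covariate that strongly predicts treatment.
x = [1, 1, 1, 1, 0, 0, 0, 0]
treated = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25, 0.25]  # treated fraction per stratum

w = iptw_weights(treated, scores)
unweighted = standardized_difference(x, treated)
weighted = standardized_difference(x, treated, w)
print(round(unweighted, 3), round(weighted, 3))
```

In this toy example the propensity scores equal the true treated fraction in each stratum, so weighting removes the imbalance entirely: the standardized difference falls from about 1.15 to zero up to rounding. With estimated scores on real data, the residual difference is what distinguishes one estimation method from another.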

Conclusion: Propensity score estimation using RFC and bagging produced better covariate balance with increased precision compared to LR. Ensemble methods are a useful alternative to logistic regression to control confounding in observational studies.

Keywords: Propensity scores; ensemble methods; tree-based methods.
