Parametric and nonparametric propensity score estimation in multilevel observational studies

Marie Salditt; Steffen Nestler

doi:10.1002/sim.9852

Stat Med. 2023 Oct 15;42(23):4147-4176. doi: 10.1002/sim.9852. Epub 2023 Aug 2.

Authors

Marie Salditt¹, Steffen Nestler¹

Affiliation

¹ Institute of Psychology, University of Münster, Münster, Germany.

PMID: 37532119
DOI: 10.1002/sim.9852

Abstract

There has been growing interest in using nonparametric machine learning approaches for propensity score estimation in order to foster robustness against misspecification of the propensity score model. However, the vast majority of studies have focused on single-level data settings, and research on nonparametric propensity score estimation in clustered data settings is scarce. In this article, we extend existing research by describing a general algorithm for incorporating random effects into a machine learning model, which we implemented for generalized boosted modeling (GBM). In a simulation study, we investigated the performance of logistic regression, GBM, and Bayesian additive regression trees for inverse probability of treatment weighting (IPW) when the data are clustered, the treatment exposure mechanism is nonlinear, and unmeasured cluster-level confounding is present. For each approach, we compared fixed and random effects propensity score models to single-level models and evaluated their use in both marginal and clustered IPW. We additionally investigated the performance of the standard Super Learner and the balance Super Learner. The results showed that when there was no unmeasured confounding, logistic regression resulted in moderate bias in both marginal and clustered IPW, whereas the nonparametric approaches were unbiased. In the presence of cluster-level confounding, fixed and random effects models greatly reduced bias compared to single-level models in marginal IPW, with fixed effects GBM and fixed effects logistic regression performing best. Finally, clustered IPW was overall preferable to marginal IPW, and the balance Super Learner outperformed the standard Super Learner, though neither performed as well as its best candidate model.
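The distinction between marginal and clustered IPW can be illustrated with a minimal sketch. It is not taken from the article: it assumes the commonly used definitions in which marginal IPW pools all units into one weighted (Hajek-type) mean difference, while clustered IPW computes a weighted mean difference within each cluster and then averages these by each cluster's total weight. The function names `marginal_ipw` and `clustered_ipw` are hypothetical, and the propensity scores `ps` are taken as given, whichever model produced them.

```python
import numpy as np


def marginal_ipw(y, z, ps):
    """Hajek-type IPW estimate of the average treatment effect,
    pooling all units regardless of cluster membership (assumed definition)."""
    y, z, ps = (np.asarray(a, dtype=float) for a in (y, z, ps))
    w = z / ps + (1 - z) / (1 - ps)  # inverse probability of treatment weights
    treated = np.sum(w * z * y) / np.sum(w * z)
    control = np.sum(w * (1 - z) * y) / np.sum(w * (1 - z))
    return treated - control


def clustered_ipw(y, z, ps, cluster):
    """Weighted average of within-cluster IPW estimates, each cluster
    weighted by the sum of its unit weights (assumed definition)."""
    y, z, ps = (np.asarray(a, dtype=float) for a in (y, z, ps))
    cluster = np.asarray(cluster)
    w = z / ps + (1 - z) / (1 - ps)
    effects, totals = [], []
    for c in np.unique(cluster):
        m = cluster == c
        # Illustrative choice: skip clusters lacking treated or control units,
        # since no within-cluster contrast can be formed there.
        if z[m].sum() == 0 or z[m].sum() == m.sum():
            continue
        effects.append(marginal_ipw(y[m], z[m], ps[m]))
        totals.append(w[m].sum())
    return np.average(effects, weights=totals)
```

Because the clustered estimator only contrasts treated and control units within the same cluster, any additive cluster-level effect on the outcome cancels from each within-cluster difference, which is one intuition for why it can be less sensitive to unmeasured cluster-level confounding than the pooled estimator.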

Keywords: Super Learner; clustering; machine learning; observational studies; propensity score weighting.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Bayes Theorem
Bias
Computer Simulation
Humans
Logistic Models
Multilevel Analysis*
Observational Studies as Topic*
Propensity Score*