Background: Propensity score (PS) methods are increasingly used, even when sample sizes are small or treatments are seldom used. However, the relative performance of the two most commonly recommended PS methods, PS-matching and inverse probability of treatment weighting (IPTW), has not been studied in the context of small sample sizes.
Methods: We conducted a series of Monte Carlo simulations to evaluate the influence of sample size, prevalence of treatment exposure, and strength of the association between the covariates and the outcome and/or the treatment exposure, on the performance of these two methods.
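As an illustration of this kind of simulation, the following minimal Python sketch compares the bias of an IPTW estimator and a PS-matching estimator at several sample sizes. It is not the simulation design used in this study: the data-generating model (one standard normal confounder, a continuous outcome), the coefficients, the normalized IPTW weights, the greedy 1:1 matching with a caliper of 0.2 standard deviations of the logit of the PS, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def expit(z):
    # Clamp the linear predictor so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_ps_model(x, t, max_iter=25, tol=1e-8):
    """Newton-Raphson fit of P(T=1 | x) = expit(a + b*x); returns (a, b)."""
    a = b = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ti in zip(x, t):
            p = expit(a + b * xi)
            w = p * (1.0 - p)
            g0 += ti - p
            g1 += (ti - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-10:  # near-separation: stop updating
            break
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def simulate(n, rng, effect):
    # Assumed data-generating model: x confounds treatment t and outcome y.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    t = [1 if rng.random() < expit(-0.5 + 0.8 * xi) else 0 for xi in x]
    y = [effect * ti + xi + rng.gauss(0.0, 1.0) for ti, xi in zip(t, x)]
    return x, t, y


def iptw_estimate(t, y, ps):
    # Normalized (Hajek) weighted difference in mean outcomes.
    treated = sum(ti * yi / p for ti, yi, p in zip(t, y, ps))
    treated /= sum(ti / p for ti, p in zip(t, ps))
    control = sum((1 - ti) * yi / (1 - p) for ti, yi, p in zip(t, y, ps))
    control /= sum((1 - ti) / (1 - p) for ti, p in zip(t, ps))
    return treated - control


def matching_estimate(t, y, ps, caliper_sd=0.2):
    # Greedy 1:1 nearest-neighbour matching without replacement on logit(PS).
    logit = [math.log(p / (1.0 - p)) for p in ps]
    mean = sum(logit) / len(logit)
    sd = math.sqrt(sum((v - mean) ** 2 for v in logit) / (len(logit) - 1))
    controls = [i for i, ti in enumerate(t) if ti == 0]
    diffs = []
    for i in [i for i, ti in enumerate(t) if ti == 1]:
        if not controls:
            break
        j = min(controls, key=lambda c: abs(logit[c] - logit[i]))
        if abs(logit[j] - logit[i]) <= caliper_sd * sd:
            diffs.append(y[i] - y[j])
            controls.remove(j)
    return sum(diffs) / len(diffs) if diffs else None


def monte_carlo(n, reps=200, effect=1.0, seed=0):
    """Return (bias of IPTW, bias of PS-matching) averaged over replicates."""
    rng = random.Random(seed)
    iptw, match = [], []
    for _ in range(reps):
        x, t, y = simulate(n, rng, effect)
        if sum(t) in (0, n):  # skip replicates with a single treatment arm
            continue
        a, b = fit_ps_model(x, t)
        ps = [expit(a + b * xi) for xi in x]
        iptw.append(iptw_estimate(t, y, ps))
        m = matching_estimate(t, y, ps)
        if m is not None:
            match.append(m)

    def bias(est):
        return sum(est) / len(est) - effect if est else float("nan")

    return bias(iptw), bias(match)


if __name__ == "__main__":
    for n in (40, 60, 200):  # illustrative sample sizes
        print(n, monte_carlo(n, reps=100))
```

Running the script prints, for each illustrative sample size, the average bias of the two estimators over the replicates. A fuller simulation would also vary the treatment prevalence and the covariate associations, and would track variance and Type I error in addition to bias.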
Results: Decreasing the sample size from 1,000 to 40 subjects did not substantially alter the Type I error rate, and relative biases stayed below 10%. IPTW performed better than PS-matching down to 60 subjects. At N = 40, the PS-matching estimators were as biased as or less biased than the IPTW estimators. Including variables unrelated to the exposure but related to the outcome in the PS model decreased both bias and variance compared with models omitting such variables. Excluding the true confounder from the PS model resulted, whichever method was used, in a significantly biased estimate of the treatment effect. These results were illustrated in a real dataset.
Conclusion: Even with small study samples or low prevalence of treatment, PS-matching and IPTW can yield correct estimates of the treatment effect, provided that the true confounders and the variables related only to the outcome are included in the PS model.