Small sample performance of bias-corrected sandwich estimators for cluster-randomized trials with binary outcomes

Stat Med. 2015 Jan 30;34(2):281-96. doi: 10.1002/sim.6344. Epub 2014 Oct 24.


The sandwich estimator in the generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently inflates type I error rates in hypothesis testing. This limits the application of GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z-test should be avoided in the analysis of CRTs with few clusters, even when bias-corrected sandwich estimators are used. With a t-distribution approximation, the Kauermann and Carroll (KC) correction can keep the test size at the nominal level even when the number of clusters is as low as 10, and it is robust to moderate variation in cluster sizes. However, when cluster sizes vary widely, the Fay and Graubard (FG) correction should be used instead. Furthermore, we derive a formula to calculate the power and the minimum total number of clusters needed when using the t-test with the KC correction for CRTs with binary outcomes. The power levels predicted by the proposed formula agree well with the empirical power from the simulations. The proposed methods are illustrated using real CRT data. We conclude that, with appropriate control of type I error rates in small samples, the GEE approach is recommended for CRTs with binary outcomes because it requires fewer assumptions and is robust to misspecification of the covariance structure.
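To make the contrast between the Wald z-test and the Wald t-test concrete, the following is a minimal sketch in Python using only the standard library. It assumes the treatment-effect estimate and its bias-corrected standard error (for example, from a KC-corrected sandwich estimator) have already been obtained elsewhere. The numbers are hypothetical, and the degrees of freedom (number of clusters minus number of regression parameters) are an assumption made for illustration, since the abstract does not state them. The function names are also illustrative only.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF of Student's t via Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df):
    """Quantile of Student's t by bisection on the CDF."""
    lo, hi = -50.0, 50.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def wald_p_values(beta_hat, se, n_clusters, n_params=2):
    """Two-sided p-values for the Wald statistic under z and t references.

    ASSUMPTION: df = n_clusters - n_params; this choice is illustrative
    and is not taken from the abstract.
    """
    stat = beta_hat / se
    df = n_clusters - n_params
    p_z = 2 * (1 - NormalDist().cdf(abs(stat)))
    p_t = 2 * (1 - t_cdf(abs(stat), df))
    return stat, p_z, p_t


# Hypothetical inputs: log odds ratio 0.9, bias-corrected SE 0.42, 10 clusters.
stat, p_z, p_t = wald_p_values(beta_hat=0.9, se=0.42, n_clusters=10)
print(f"Wald statistic: {stat:.3f}")
print(f"z-test p-value: {p_z:.4f}")
print(f"t-test p-value: {p_t:.4f}")
```

With these hypothetical inputs the Wald statistic falls between the normal and the t critical values at the 5% level, so the z-test rejects while the t-test does not. This is the kind of discrepancy that matters when the number of clusters is small.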

Keywords: correlated data; generalized estimating equations (GEE); power; sample size; type I error rates.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bias
  • Breast Neoplasms / diagnosis
  • Computer Simulation
  • Early Detection of Cancer / statistics & numerical data
  • Female
  • General Practice / methods
  • General Practice / statistics & numerical data
  • Health Services Research / methods
  • Health Services Research / statistics & numerical data*
  • Humans
  • London
  • Monte Carlo Method
  • Outcome Assessment, Health Care / methods
  • Outcome Assessment, Health Care / statistics & numerical data*
  • Patient Acceptance of Health Care / statistics & numerical data*
  • Randomized Controlled Trials as Topic / methods
  • Randomized Controlled Trials as Topic / statistics & numerical data*
  • Sample Size*