Is the R coefficient of interest in cluster randomized trials with a binary outcome?

Stat Methods Med Res. 2020 Sep;29(9):2470-2480. doi: 10.1177/0962280219900200. Epub 2020 Jan 23.


In cluster randomized trials, the intraclass correlation coefficient (ICC) is classically used to measure clustering. When the outcome is binary, the ICC is known to be associated with the prevalence of the outcome. This association complicates its interpretation and can be problematic for sample size calculation. To overcome these issues, Crespi et al. extended a coefficient named R, initially proposed by Rosner for ophthalmologic data, to cluster randomized trials. Crespi et al. asserted that R may be less influenced by the outcome prevalence than is the ICC, although they provided only empirical data to support this assertion. They also asserted that "the traditional ICC approach to sample size determination tends to overpower studies under many scenarios, calling for more clusters than truly required", although they did not consider empirical power. The aim of this study was to investigate whether R could indeed be considered independent of the outcome prevalence. We also considered whether sample size calculation would be better based on the R coefficient or on the ICC. Considering the particular case of 2 individuals per cluster, we demonstrated theoretically that R is not symmetrical around the 0.5 prevalence value, which in itself demonstrates the dependence of R on prevalence. We also conducted a simulation study to explore the case of both fixed and variable cluster sizes greater than 2. This simulation study showed that R decreases when prevalence increases from 0 to 1. Both the analytical and simulation results therefore indicate that R depends on the outcome prevalence. In terms of sample size calculation, we showed that an approach based on the ICC is preferable to an approach based on the R coefficient because with the former, the empirical power is closer to the nominal one. Hence, for binary outcomes, the R coefficient does not offer any advantage over the ICC.
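
As a rough illustration of the ICC-based approach to sample size mentioned in the abstract, the sketch below inflates a standard two-proportion sample size by the usual design effect 1 + (m - 1) x ICC for clusters of fixed size m. It is only a minimal sketch under textbook assumptions (equal cluster sizes, two-sided test, normal approximation) and is not the procedure used by the authors. The function names and the example values are hypothetical. The R-based approach is not shown because the abstract does not define R.

```python
import math
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def clusters_per_arm(p1, p2, cluster_size, icc, alpha=0.05, power=0.80):
    """Approximate number of clusters per arm to compare two proportions.

    Assumes equal cluster sizes, a two-sided test at level alpha, and the
    unpooled normal-approximation formula for individual randomization.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)

    # Individuals per arm if individuals (not clusters) were randomized.
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n_individual = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2

    # Inflate for clustering, then convert individuals to whole clusters.
    n_clustered = n_individual * design_effect(cluster_size, icc)
    return math.ceil(n_clustered / cluster_size)


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    for icc in (0.0, 0.01, 0.05, 0.10):
        k = clusters_per_arm(p1=0.5, p2=0.3, cluster_size=20, icc=icc)
        print(f"ICC={icc:.2f}: {k} clusters per arm")
```

Because the ICC for a binary outcome varies with prevalence, the value plugged into such a calculation needs to correspond to the prevalence expected in the planned trial.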

Keywords: Intraclass correlation coefficient; R coefficient; binary outcome; cluster; prevalence.

MeSH terms

  • Cluster Analysis
  • Computer Simulation
  • Humans
  • Prevalence
  • Randomized Controlled Trials as Topic
  • Research Design*
  • Sample Size