Intraclass correlation coefficient and outcome prevalence are associated in clustered binary data

J Clin Epidemiol. 2005 Mar;58(3):246-51. doi: 10.1016/j.jclinepi.2004.08.012.


Background and objective: To describe the association between values for a proportion and the intraclass correlation coefficient (ICC).

Methods: Analysis of data from the General Practice Research Database (GPRD) on variation between United Kingdom general practices, and of results from a Health Technology Assessment (HTA) review, covering a range of outcomes in community and health services settings.

Results: There were 188 ICCs from the GPRD, with median prevalence 13.1% (interquartile range [IQR] 3.5 to 28.4%) and median ICC 0.051 (IQR 0.011 to 0.094). There were 136 ICCs from the HTA review, with median prevalence 6.5% (IQR 0.4 to 20.7%) and median ICC 0.006 (IQR 0.0003 to 0.036). There was a linear association of log ICC with log prevalence in both datasets (GPRD, regression coefficient 0.61, 95% confidence interval 0.53 to 0.69, P < 0.001; HTA, 0.91, 0.81 to 1.01, P < 0.001). When the prevalence was 1%, the predicted ICC was 0.008 from the GPRD or 0.002 from the HTA; when the prevalence was 40%, the predicted ICC was 0.075 (GPRD) or 0.046 (HTA).
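A linear association on the log-log scale means the ICC follows an approximate power law in prevalence. The sketch below is not the authors' fitted model: the abstract gives no intercepts, so it simply draws a straight line on the log-log scale through the two predicted points reported for each dataset (1% and 40% prevalence). The names `REPORTED`, `implied_slope`, and `predicted_icc` are hypothetical, and because the reported ICCs are rounded, the implied slopes only approximate the published coefficients. Values outside the 1% to 40% range are extrapolations and should be treated with caution.

```python
import math

# Predicted ICCs reported in the abstract, keyed by prevalence (as a proportion).
# These are rounded published values, not the underlying regression output.
REPORTED = {
    "GPRD": {0.01: 0.008, 0.40: 0.075},
    "HTA": {0.01: 0.002, 0.40: 0.046},
}


def implied_slope(dataset="GPRD"):
    """Slope of log(ICC) on log(prevalence) implied by the two reported points."""
    (p1, icc1), (p2, icc2) = sorted(REPORTED[dataset].items())
    return (math.log(icc2) - math.log(icc1)) / (math.log(p2) - math.log(p1))


def predicted_icc(prevalence, dataset="GPRD"):
    """Interpolate the ICC on the log-log scale between the reported points.

    Assumption: a single straight line on the log-log scale, anchored at the
    reported 1% prediction. Prevalence is a proportion strictly between 0 and 1.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    p1, icc1 = min(REPORTED[dataset].items())
    log_icc = math.log(icc1) + implied_slope(dataset) * (
        math.log(prevalence) - math.log(p1)
    )
    return math.exp(log_icc)


if __name__ == "__main__":
    for name in REPORTED:
        print(name, "implied slope:", round(implied_slope(name), 3))
        for prev in (0.01, 0.05, 0.10, 0.20, 0.40):
            print(f"  prevalence {prev:.0%}: ICC ~ {predicted_icc(prev, name):.4f}")
```

The implied slopes fall inside the reported 95% confidence intervals for both datasets, which is a useful consistency check on the rounded figures.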

Conclusion: The prevalence of an outcome may be used to make an informed assumption about the magnitude of the intraclass correlation coefficient.
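One common use of an assumed ICC is inflating a sample size for clustering. The abstract does not describe this step, so the sketch below is an illustration only: it uses the standard design effect for equal cluster sizes, 1 + (m - 1) x ICC, and the function name `design_effect` and the cluster size of 50 are hypothetical choices.

```python
def design_effect(icc, cluster_size):
    """Standard design effect for equal cluster sizes: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1) * icc


if __name__ == "__main__":
    # ICCs predicted from the GPRD data at 1% and 40% prevalence (from the abstract),
    # combined with an assumed, purely illustrative cluster size of 50.
    for label, icc in (("prevalence 1%", 0.008), ("prevalence 40%", 0.075)):
        print(label, "design effect:", round(design_effect(icc, 50), 3))
```

Because the design effect grows with the ICC, an outcome with higher prevalence would, under these assumptions, need a larger inflation of the sample size than a rare one.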

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Interpretation, Statistical*
  • Databases, Factual
  • Family Practice
  • Health Services Research / methods*
  • Humans
  • Prevalence
  • Primary Health Care*
  • Randomized Controlled Trials as Topic / methods*
  • Research Design
  • Technology Assessment, Biomedical / methods
  • United Kingdom