Marginal modeling of nonnested multilevel data using standard software

Am J Epidemiol. 2007 Feb 15;165(4):453-63. doi: 10.1093/aje/kwk020. Epub 2006 Nov 22.


Epidemiologic data are often clustered within multiple levels that may not be nested within each other. Generalized estimating equations are commonly used to adjust for correlation among observations within clusters when fitting regression models; however, standard software does not currently accommodate nonnested clusters. This paper introduces a simple generalized estimating equation strategy that uses available commercial or public software for the regression analysis of nonnested multilevel data. The authors describe how to obtain empirical standard error estimates for constructing valid confidence intervals and conducting statistical hypothesis tests. The method is evaluated using simulations and illustrated with an analysis of data from the Breast Cancer Surveillance Consortium that estimates the influence of woman, radiologist, and facility characteristics on the positive predictive value of screening mammography. Performance with a small number of clusters is discussed. Both the simulations and the example demonstrate the importance of accounting for the correlation within all levels of clustering for proper inference.
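
The abstract does not spell out how the empirical standard errors are obtained. As a rough illustration only, the sketch below shows one well-known way to get cluster-robust (sandwich) variances for two nonnested levels: fit the model under working independence, compute a sandwich variance clustered on each level and on their intersection, and combine them as V_A + V_B - V_AB. It is an assumption that this matches the paper's procedure. The sketch also uses a linear model rather than the binary-outcome model of the mammography example, and the function names `cluster_sandwich` and `nonnested_sandwich` are hypothetical.

```python
import numpy as np


def cluster_sandwich(X, resid, groups):
    """Sandwich covariance of least-squares coefficients, clustered on `groups`.

    `groups` may be a 1-D array of cluster labels or a 2-D array whose rows
    identify the cluster (used here for the intersection of two levels).
    """
    bread = np.linalg.inv(X.T @ X)
    scores = X * resid[:, None]  # per-observation score contributions
    _, codes = np.unique(groups, axis=0, return_inverse=True)
    codes = codes.ravel()
    summed = np.zeros((codes.max() + 1, X.shape[1]))
    np.add.at(summed, codes, scores)  # sum the scores within each cluster
    meat = summed.T @ summed
    return bread @ meat @ bread


def nonnested_sandwich(X, y, level_a, level_b):
    """Working-independence fit with a two-way clustered covariance.

    Assumed combination rule (not taken from the abstract):
    V = V_A + V_B - V_AB, where V_AB clusters on the A-by-B intersection.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    v_a = cluster_sandwich(X, resid, level_a)
    v_b = cluster_sandwich(X, resid, level_b)
    v_ab = cluster_sandwich(X, resid, np.column_stack([level_a, level_b]))
    return beta, v_a + v_b - v_ab
```

Standard errors would be the square roots of the diagonal of the returned matrix. The combined matrix is not guaranteed to be positive semidefinite, which is one reason to be cautious when the number of clusters at any level is small.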

Publication types

  • Multicenter Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Aged
  • Breast Neoplasms / diagnostic imaging
  • Breast Neoplasms / epidemiology*
  • Cluster Analysis
  • Female
  • Humans
  • Mammography
  • Mass Screening / methods*
  • Middle Aged
  • Models, Statistical*
  • Morbidity / trends
  • Prognosis
  • Software / standards*