Using bivariate models to understand between- and within-cluster regression coefficients, with application to twin data

Biometrics. 2006 Sep;62(3):745-51. doi: 10.1111/j.1541-0420.2006.00561.x.


In the regression analysis of clustered data it is important to allow for the possibility of distinct between- and within-cluster exposure effects on the outcome measure, represented, respectively, by regression coefficients for the cluster mean and the deviation of the individual-level exposure value from this mean. In twin data, the within-pair regression effect represents association conditional on exposures shared within pairs, including any common genetic or environmental influences on the outcome measure. It has therefore been proposed that a comparison of the within-pair regression effects between monozygous (MZ) and dizygous (DZ) twins can be used to examine whether the association between exposure and outcome has a genetic origin. We address this issue by proposing a bivariate model for exposure and outcome measurements in twin-pair data. The between- and within-pair regression coefficients are shown to be weighted averages of ratios of the exposure and outcome variances and covariances, from which it is straightforward to determine the conditions under which the within-pair regression effect in MZ pairs will be different from that in DZ pairs. In particular, we show that a correlation structure in twin pairs for exposure and outcome that appears to be due to genetic factors will not necessarily be reflected in distinct MZ and DZ values for the within-pair regression coefficients. We illustrate these results in a study of female twin pairs from Australia and North America relating mammographic breast density to weight and body mass index.
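The between- and within-cluster coefficients described in the opening sentence can be illustrated with a short numerical sketch. This is not the bivariate model proposed in the paper. It is only the standard decomposition of an exposure into the pair mean and the individual deviation from that mean, applied to simulated pairs. The function names, the simulated effect sizes, and the `(n_pairs, 2)` data layout are all assumptions made for illustration.

```python
import numpy as np


def between_within(x, y):
    """Between- and within-pair slopes for paired data of shape (n_pairs, 2).

    Hypothetical helper: uses moment ratios, which coincide with the OLS
    coefficients because pair means and within-pair deviations are orthogonal
    when every cluster has the same size.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = x.mean(axis=1, keepdims=True)  # pair mean of exposure
    y_mean = y.mean(axis=1, keepdims=True)  # pair mean of outcome
    x_dev = x - x_mean                      # individual deviation from pair mean
    y_dev = y - y_mean

    # Within-pair slope: covariance of deviations over variance of deviations.
    beta_w = (x_dev * y_dev).sum() / (x_dev ** 2).sum()

    # Between-pair slope: covariance of pair means over variance of pair means.
    xm_c = x_mean - x_mean.mean()
    ym_c = y_mean - y_mean.mean()
    beta_b = (xm_c * ym_c).sum() / (xm_c ** 2).sum()
    return beta_b, beta_w


def between_within_ols(x, y):
    """Same two slopes from one regression of y on [1, pair mean, deviation]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = np.repeat(x.mean(axis=1), x.shape[1])  # pair mean, one row per twin
    design = np.column_stack([np.ones(x.size), x_mean, x.ravel() - x_mean])
    coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
    return coef[1], coef[2]


# Simulated pairs (assumed values): a shared pair-level factor affects both
# exposure and outcome, so the between- and within-pair slopes differ.
rng = np.random.default_rng(0)
n_pairs = 500
shared = rng.normal(size=(n_pairs, 1))
x_sim = shared + rng.normal(size=(n_pairs, 2))
y_sim = 1.5 * shared + 0.3 * x_sim + rng.normal(size=(n_pairs, 2))

print("moment ratios (between, within):", between_within(x_sim, y_sim))
print("single OLS fit (between, within):", between_within_ols(x_sim, y_sim))
```

Running the same function separately on MZ and DZ pairs would give the zygosity-specific within-pair coefficients whose comparison the abstract discusses. As the abstract notes, similar values in the two groups do not by themselves rule out a genetic contribution to the exposure-outcome correlation.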

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Australia
  • Biometry / methods*
  • Body Mass Index
  • Body Weight
  • Breast / anatomy & histology
  • Cluster Analysis
  • Data Interpretation, Statistical
  • Female
  • Humans
  • Mammography / statistics & numerical data
  • Markov Chains
  • Models, Biological
  • Models, Statistical
  • Monte Carlo Method
  • North America
  • Regression Analysis*
  • Twin Studies as Topic / statistics & numerical data*
  • Twins, Dizygotic
  • Twins, Monozygotic