Two-stage methods for the analysis of pooled data

Stat Med. 2001 Jul 30;20(14):2115-30. doi: 10.1002/sim.852.


Epidemiologic studies of disease often produce inconclusive or contradictory results due to small sample sizes or regional variations in the disease incidence or the exposures. To clarify these issues, researchers occasionally pool and reanalyse original data from several large studies. In this paper we explore the use of a two-stage random-effects model for analysing pooled case-control studies and undertake a thorough examination of bias in the pooled estimator under various conditions. The two-stage model analyses each study using the model appropriate to the design with study-specific confounders, and combines the individual study-specific adjusted log-odds ratios using a linear mixed-effects model; it is computationally simple and can incorporate study-level covariates and random effects. Simulations indicate that when the individual studies are large, two-stage methods produce nearly unbiased exposure estimates and standard errors of the exposure estimates from a generalized linear mixed model. By contrast, joint fixed-effects logistic regression produces attenuated exposure estimates and underestimates the standard error when heterogeneity is present. While bias in the pooled regression coefficient increases with interstudy heterogeneity for both models, it is much smaller using the two-stage model. In pooled analyses, where covariates may not be uniformly defined and coded across studies, and occasionally not measured in all studies, a joint model is often not feasible. The two-stage method is shown to be a simple, valid and practical method for the analysis of pooled binary data. The results are applied to a study of reproductive history and cutaneous melanoma risk in women using data from ten large case-control studies.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adolescent
  • Adult
  • Case-Control Studies*
  • Computer Simulation
  • Contraceptives, Oral / administration & dosage
  • Data Interpretation, Statistical*
  • Female
  • Humans
  • Linear Models
  • Melanoma / etiology
  • Meta-Analysis as Topic
  • Middle Aged
  • Models, Biological*
  • Models, Statistical*
  • Multicenter Studies as Topic
  • Pregnancy


  • Contraceptives, Oral