Hypothesis tests for population heterogeneity in meta-analysis

Br J Math Stat Psychol. 2007 May;60(Pt 1):29-60. doi: 10.1348/000711005X64042.


The choice of an appropriate model in meta-analysis is often treated as an empirical question, answered by examining the amount of variability in the effect sizes. When all of the observed variability in the effect sizes can be attributed to sampling error alone, the set of effect sizes is said to be homogeneous and a fixed-effects model is typically adopted. Whether a set of effect sizes is homogeneous is usually tested with the so-called Q test. In this paper, a variety of alternative homogeneity tests (the likelihood ratio, Wald and score tests) are compared with the Q test in terms of their Type I error rate and power for four different effect size measures. Monte Carlo simulations show that the Q test kept the tightest control of the Type I error rate, although the results emphasize the importance of large sample sizes within the set of studies. The results also suggest under what conditions the power of the tests can be considered adequate.
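
For readers unfamiliar with the Q test, the following is a minimal sketch of its usual textbook form, not the procedure used in the paper, which the abstract does not spell out. It assumes that each study supplies an effect size estimate and a known sampling variance, that the weights are inverse variances, and that Q is referred to a chi-square distribution with k - 1 degrees of freedom under homogeneity. The function names q_test and chi2_sf and the example numbers are hypothetical, chosen for illustration only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the recurrence for the regularized upper incomplete gamma function,
    Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1), with h = x / 2.
    """
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)               # df = 2 starting value
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))    # df = 1 starting value
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def q_test(effects, variances):
    """Return (Q, df, p) for the homogeneity test.

    Assumption: `variances` holds the sampling variance of each effect size
    and is treated as known, as in the standard fixed-effects setup.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    return q, df, chi2_sf(q, df)


# Hypothetical data: three studies with equal sampling variances.
q, df, p = q_test([0.1, 0.3, 0.5], [0.04, 0.04, 0.04])
print(q, df, p)
```

A small p-value would lead to rejecting homogeneity. With df = 2, as in this example, the p-value reduces to exp(-Q/2), which gives a quick check on the helper function.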

MeSH terms

  • Analysis of Variance
  • Bias
  • Data Interpretation, Statistical*
  • Effect Modifier, Epidemiologic
  • Humans
  • Likelihood Functions
  • Meta-Analysis as Topic*
  • Models, Statistical
  • Monte Carlo Method
  • Reproducibility of Results