Sample size and optimal design for logistic regression with binary interaction

Stat Med. 2008 Jan 15;27(1):36-46. doi: 10.1002/sim.2980.


There is no consensus on which test to use as the basis for sample size determination and power analysis. Some authors advocate the Wald test and some the likelihood-ratio test. We argue that the Wald test should be used because the Z-score is commonly applied for regression coefficient significance testing and therefore the same statistic should be used in the power function. We correct a widespread mistake in sample size determination that arises when the variance of the maximum likelihood estimate (MLE) is estimated at the null value. In our previous paper, we developed a correct sample size formula for logistic regression with a single exposure (Statist. Med. 2007; 26(18):3385-3397). In the present paper, closed-form formulas are derived for interaction studies with binary exposure and covariate in logistic regression. The formula for the optimal control-case ratio is derived such that it maximizes the power function given other parameters. Our sample size and power calculations with interaction can be carried out online at ~eugened.
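
As a rough illustration of a Wald-based power function for an interaction coefficient, the sketch below computes power and sample size for the model logit P(y=1 | x, z) = b0 + b1*x + b2*z + b3*x*z with binary x and z. It is not the closed-form formula derived in the paper, which is not reproduced in this abstract. It assumes cohort (not case-control) sampling, an exposure and covariate that are independent with prevalences `px` and `pz`, and a two-sided test whose opposite rejection tail is ignored. The variance of the MLE is evaluated at the alternative rather than at the null value. All function and parameter names are hypothetical.

```python
from math import ceil, exp, sqrt
from statistics import NormalDist


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-t))


def unit_variance(b0, b1, b2, b3, px, pz):
    """Asymptotic variance of sqrt(n) * (b3_hat - b3) at the alternative.

    The model is saturated on the four (x, z) cells, so b3 is a signed sum
    of the four cell log-odds and its variance is the sum of the per-cell
    variances 1 / (w * p * (1 - p)), where w is the cell proportion
    (assumption: x and z independent) and p the cell outcome probability.
    """
    total = 0.0
    for x in (0, 1):
        for z in (0, 1):
            w = (px if x else 1.0 - px) * (pz if z else 1.0 - pz)
            p = expit(b0 + b1 * x + b2 * z + b3 * x * z)
            total += 1.0 / (w * p * (1.0 - p))
    return total


def wald_power(n, b0, b1, b2, b3, px, pz, alpha=0.05):
    """Approximate power of the two-sided Wald test of b3 = 0 with n subjects."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = sqrt(unit_variance(b0, b1, b2, b3, px, pz) / n)
    return NormalDist().cdf(abs(b3) / se - z_crit)


def wald_sample_size(power, b0, b1, b2, b3, px, pz, alpha=0.05):
    """Smallest n whose approximate Wald power reaches the target."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_pow = NormalDist().inv_cdf(power)
    v = unit_variance(b0, b1, b2, b3, px, pz)
    return ceil(v * (z_crit + z_pow) ** 2 / b3 ** 2)


if __name__ == "__main__":
    # Illustrative (made-up) parameter values.
    n = wald_sample_size(0.8, b0=-1.0, b1=0.3, b2=0.2, b3=0.7, px=0.4, pz=0.3)
    print(n, wald_power(n, -1.0, 0.3, 0.2, 0.7, 0.4, 0.3))
```

Because the variance term sums reciprocals of cell counts, the smallest (x, z) cell dominates the required sample size, which is why interaction studies typically need far more subjects than main-effect studies under the same assumptions.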

MeSH terms

  • Asthma / genetics
  • Case-Control Studies
  • Environment
  • Genetics
  • Humans
  • Likelihood Functions
  • Logistic Models*
  • Research Design
  • Sample Size*