Parameter estimation and goodness-of-fit in log binomial regression

Biom J. 2006 Feb;48(1):5-22. doi: 10.1002/bimj.200410165.
An estimate of the relative risk, adjusted for confounders, can be obtained from a fitted logistic regression model, but it substantially overestimates the relative risk when the outcome is not rare. The log binomial model (binomial errors and log link) is increasingly being used for this purpose. However, this model's performance, goodness-of-fit tests and case-wise diagnostics have not been studied. Extensive simulations are used to compare the performance of three approaches: the log binomial model, a logistic regression based method proposed by Schouten et al. (1993), and a Poisson regression approach proposed by Zou (2004) and Carter, Lipsitz, and Tilley (2005). Log binomial regression resulted in "failure" rates (non-convergence, out-of-bounds predicted probabilities) as high as 59%. Estimates by the method of Schouten et al. (1993) produced fitted log binomial probabilities greater than unity in up to 19% of samples to which a log binomial model had been successfully fitted and in up to 78% of samples in which the log binomial fit failed. Similar percentages were observed for the Poisson regression approach. Coefficient and standard error estimates from the three models were similar. Rejection rates for goodness-of-fit tests of the log binomial model were around 5%. Power of the goodness-of-fit tests was modest when an incorrect logistic regression model was fitted. Examples demonstrate the use of the methods. Uncritical use of the log binomial regression model is not recommended.
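Two points in the abstract can be illustrated with simple arithmetic: the odds ratio drifts away from the relative risk as the outcome becomes common, and a log link can return fitted "probabilities" above one. The sketch below is not taken from the paper. The risks, the intercept and slope, and the function names are all hypothetical values chosen for illustration.

```python
import math


def risk_ratio(p_exposed, p_unexposed):
    """Relative risk: ratio of the two outcome probabilities."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio: the quantity a logistic regression coefficient exponentiates to."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed


def log_link_fitted_value(intercept, slope, x):
    """Fitted value under a log link: exp(linear predictor). Nothing bounds it by 1."""
    return math.exp(intercept + slope * x)


# Hypothetical risks (exposed, unexposed); both scenarios have a relative risk of 2.
scenarios = {"rare outcome": (0.02, 0.01), "common outcome": (0.60, 0.30)}
for label, (p1, p0) in scenarios.items():
    print(f"{label}: RR = {risk_ratio(p1, p0):.2f}, OR = {odds_ratio(p1, p0):.2f}")

# Hypothetical log binomial coefficients: baseline risk 0.30, relative risk 2 per unit of x.
intercept, slope = math.log(0.30), math.log(2.0)
for x in (0, 1, 2):
    fitted = log_link_fitted_value(intercept, slope, x)
    flag = "out of bounds" if fitted > 1.0 else "valid"
    print(f"x = {x}: fitted value = {fitted:.2f} ({flag})")
```

With risks of 0.02 and 0.01 the odds ratio stays close to the relative risk of 2, whereas with risks of 0.60 and 0.30 the odds ratio is 3.5. In the second loop the fitted value at x = 2 is 0.30 × 4 = 1.2, which is the kind of out-of-bounds prediction counted as a "failure" in the simulations.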

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Biometry / methods*
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Linear Models*
  • Logistic Models*
  • Models, Biological
  • Numerical Analysis, Computer-Assisted
  • Proportional Hazards Models*
  • Regression Analysis*