Explained variation for logistic regression

Stat Med. 1996 Oct 15;15(19):1987-97. doi: 10.1002/(SICI)1097-0258(19961015)15:19<1987::AID-SIM318>3.0.CO;2-9.


Different measures of the proportion of variation in a dependent variable explained by covariates are reported by different standard programs for logistic regression. We review twelve measures that have been suggested or might be useful to measure explained variation in logistic regression models. The definitions and properties of these measures are discussed and their performance is compared in an empirical study. Two of the measures (squared Pearson correlation between the binary outcome and the predictor, and the proportional reduction of squared Pearson residuals by the use of covariates) give almost identical results, agree very well with the multiple R² of the general linear model, have an intuitively clear interpretation and perform satisfactorily in our study. For all measures the explained variation for the given sample and also the one expected in future samples can be obtained easily. For small samples an adjustment analogous to the adjusted R² (R²_adj) in the general linear model is suggested. We discuss some aspects of application and recommend the routine use of a suitable measure of explained variation for logistic models.
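The two measures singled out in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: "the predictor" is read as the fitted probability p̂; the residual-based measure is taken in its sum-of-squares form, 1 - Σ(y - p̂)² / Σ(y - ȳ)²; and the small-sample adjustment simply reuses the adjusted R² formula of the linear model. All function names are hypothetical, and the data are simulated for illustration only.

```python
import numpy as np


def r2_correlation(y, p_hat):
    """Squared Pearson correlation between binary outcome and fitted probability."""
    y = np.asarray(y, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    return np.corrcoef(y, p_hat)[0, 1] ** 2


def r2_sum_of_squares(y, p_hat):
    """Proportional reduction in squared residuals relative to the null model.

    Assumption: residuals are the raw differences y - p_hat, and the null
    model predicts the sample mean of y for every observation.
    """
    y = np.asarray(y, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    ss_model = np.sum((y - p_hat) ** 2)
    ss_null = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_model / ss_null


def r2_adjusted(r2, n, k):
    """Adjustment borrowed from the linear model (assumed form): n cases, k covariates."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def fit_logistic(X, y, n_iter=25):
    """Plain Newton-Raphson fit of a logistic model with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))
        w = p * (1.0 - p)
        grad = X1.T @ (y - p)                  # score vector
        hess = X1.T @ (X1 * w[:, None])        # observed information
        beta = beta + np.linalg.solve(hess, grad)
    p_hat = 1.0 / (1.0 + np.exp(-X1 @ beta))
    return beta, p_hat


# Simulated example data (illustrative only, not from the study).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x))))

beta, p_hat = fit_logistic(x.reshape(-1, 1), y)
r2_corr = r2_correlation(y, p_hat)
r2_ss = r2_sum_of_squares(y, p_hat)
r2_ss_adj = r2_adjusted(r2_ss, n=n, k=1)
print(r2_corr, r2_ss, r2_ss_adj)
```

On a maximum likelihood fit with an intercept, the mean of p̂ equals the mean of y, which is one reason the two quantities tend to lie close together, in line with the "almost identical results" reported above.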

MeSH terms

  • Likelihood Functions
  • Linear Models
  • Logistic Models*
  • Observer Variation
  • Statistics, Nonparametric*