A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables

Psychol Methods. 2006 Mar;11(1):54-71. doi: 10.1037/1082-989X.11.1.54.


Uncorrectable skew and heteroscedasticity are among the "lemons" of psychological data, yet many important variables naturally exhibit these properties. For scales with a lower and upper bound, a suitable candidate for models is the beta distribution, which is very flexible and models skew quite well. The authors present maximum-likelihood regression models assuming that the dependent variable is conditionally beta distributed rather than Gaussian. The approach models both means (location) and variances (dispersion) with their own distinct sets of predictors (continuous and/or categorical), thereby modeling heteroscedasticity. The location sub-model link function is the logit and thereby analogous to logistic regression, whereas the dispersion sub-model is log linear. Real examples show that these models handle the independent observations case readily. The article discusses comparisons between beta regression and alternative techniques, model selection and interpretation, practical estimation, and software.
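As an illustration of the two sub-models described in the abstract, the following is a minimal sketch of the conditional log-likelihood. It is not the authors' code. It assumes the common mean-precision parameterization, in which the beta shape parameters are mu*phi and (1 - mu)*phi, with mu obtained from the location predictors through the inverse logit and phi from the dispersion predictors through exp. The function names, the argument layout, and the sign convention for the dispersion coefficients (larger phi meaning smaller variance) are assumptions made for this example.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def beta_loglik(beta, delta, X, W, y):
    """Log-likelihood of a beta regression with separate sub-models.

    beta  : coefficients of the location sub-model (logit link)
    delta : coefficients of the dispersion sub-model (log link)
    X, W  : lists of predictor rows for location and dispersion
    y     : responses, each strictly between 0 and 1

    Assumed parameterization (not taken from the abstract):
    mu = inv_logit(x . beta), phi = exp(w . delta),
    shape parameters a = mu * phi and b = (1 - mu) * phi.
    """
    total = 0.0
    for x_i, w_i, y_i in zip(X, W, y):
        mu = inv_logit(sum(c * x for c, x in zip(beta, x_i)))
        phi = math.exp(sum(c * w for c, w in zip(delta, w_i)))
        shape_a = mu * phi
        shape_b = (1.0 - mu) * phi
        # Log of the beta density evaluated at y_i
        total += (
            math.lgamma(shape_a + shape_b)
            - math.lgamma(shape_a)
            - math.lgamma(shape_b)
            + (shape_a - 1.0) * math.log(y_i)
            + (shape_b - 1.0) * math.log(1.0 - y_i)
        )
    return total
```

Under this parameterization the conditional variance is mu*(1 - mu)/(1 + phi), so predictors in the dispersion sub-model change the variance independently of the mean. Estimates could then be obtained by passing the negative of this function to a general-purpose numerical optimizer such as scipy.optimize.minimize.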

MeSH terms

  • Analysis of Variance
  • Bias
  • Child
  • Data Interpretation, Statistical
  • Dyslexia / epidemiology
  • Humans
  • Least-Squares Analysis
  • Likelihood Functions*
  • Linear Models
  • Models, Statistical*
  • Normal Distribution
  • Regression Analysis*
  • Reproducibility of Results