Background: Health-related quality of life (HRQL) has become an increasingly important outcome parameter in clinical trials and epidemiological research. HRQL scores are typically bounded at both ends of the scale and often highly skewed. Several regression techniques have been proposed to model such data in cross-sectional studies, however, methods applicable in longitudinal research are less well researched. This study examined the use of beta regression models for analyzing longitudinal HRQL data using two empirical examples with distributional features typically encountered in practice.
Methods: We used SF-6D utility data from a German older age cohort study and stroke-specific HRQL data from a randomized controlled trial. We described the conceptual differences between mixed and marginal beta regression models and compared both models to the commonly used linear mixed model in terms of overall fit and predictive accuracy.
Results: At any measurement time, the beta distribution fitted the SF-6D utility data and stroke-specific HRQL data better than the normal distribution. The mixed beta model showed better likelihood-based fit statistics than the linear mixed model and respected the boundedness of the outcome variable. However, it tended to underestimate the true mean at the upper part of the distribution. Adjusted group means from marginal beta model and linear mixed model were nearly identical but differences could be observed with respect to standard errors.
Conclusions: Understanding the conceptual differences between mixed and marginal beta regression models is important for their proper use in the analysis of longitudinal HRQL data. Beta regression fits the typical distribution of HRQL data better than linear mixed models, however, if focus is on estimating group mean scores rather than making individual predictions, the two methods might not differ substantially.