Model Selection with the Linear Mixed Model for Longitudinal Data

Multivariate Behav Res. 2011 Jul 29;46(4):598-624. doi: 10.1080/00273171.2011.589264.

Abstract

Model building or model selection with linear mixed models (LMMs) is complicated by the presence of both fixed effects and random effects. The fixed effects structure and random effects structure are codependent, so selection of one influences the other. Most presentations of LMMs in psychology and education are based on a multilevel or hierarchical approach in which the variance-covariance matrix of the random effects is assumed to be positive definite with nonzero values for the variances. When the number of fixed effects and random effects is unknown, the predominant approach to model building is a step-up method in which one starts with a limited model (e.g., few fixed effects and random intercepts) and then adds fixed effects and random effects based on statistical tests. A model building approach that has received less attention in psychology and education is a top-down method. In the top-down method, the initial model has a single random intercept but is loaded with fixed effects (also known as an "overelaborate" model). Based on the overelaborate fixed effects model, the need for additional random effects is determined. There has been little, if any, examination of the ability of these methods to identify a true population model (i.e., the model that generated the data). The purpose of this article is to examine the performance of the step-up and top-down model building approaches for exploratory longitudinal data analysis. Student achievement data sets from the Chicago Longitudinal Study serve as the populations in the simulations.
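The contrast between the two search strategies can be sketched in a few lines of Python. This is only an illustration of the control flow described in the abstract, not the authors' procedure: no model is fitted, a model is represented solely by the names of its terms, and the `improves` callable is a hypothetical stand-in for whatever statistical test (for example, a likelihood ratio test) decides whether a candidate term is retained. The order in which terms are screened in `step_up` (fixed effects first, then random effects) is also an assumption, as are all function and term names.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# A model is described only by its term names: (fixed effects, random effects).
Model = Tuple[FrozenSet[str], FrozenSet[str]]
# Hypothetical stand-in for a statistical test comparing two nested models.
Improves = Callable[[Model, Model], bool]


def step_up(fixed_candidates: Iterable[str],
            random_candidates: Iterable[str],
            improves: Improves) -> Model:
    """Start from a limited model and add each term that passes the test."""
    model: Model = (frozenset({"intercept"}), frozenset({"intercept"}))
    # Assumption: fixed effects are screened before random effects.
    for term in fixed_candidates:
        candidate: Model = (model[0] | {term}, model[1])
        if improves(model, candidate):
            model = candidate
    for term in random_candidates:
        candidate = (model[0], model[1] | {term})
        if improves(model, candidate):
            model = candidate
    return model


def top_down(fixed_candidates: Iterable[str],
             random_candidates: Iterable[str],
             improves: Improves) -> Model:
    """Start with all fixed effects and one random intercept, then test
    whether additional random effects are needed."""
    model: Model = (frozenset({"intercept", *fixed_candidates}),
                    frozenset({"intercept"}))
    for term in random_candidates:
        candidate: Model = (model[0], model[1] | {term})
        if improves(model, candidate):
            model = candidate
    return model


def toy_improves(current: Model, candidate: Model) -> bool:
    """Toy rule for demonstration only: accept a term only if it is 'time'."""
    added = (candidate[0] - current[0]) | (candidate[1] - current[1])
    return added <= {"time"}


step_up_model = step_up(["time", "time_sq"], ["time"], toy_improves)
top_down_model = top_down(["time", "time_sq"], ["time"], toy_improves)
```

With the toy rule, the step-up search never admits the quadratic fixed effect, whereas the top-down search retains it because the overelaborate fixed effects structure is the starting point and only the random effects are tested. In a real analysis `improves` would wrap actual model fits and a test or information criterion.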