Handling drop-out in longitudinal studies

Stat Med. 2004 May 15;23(9):1455-97. doi: 10.1002/sim.1728.


Drop-out is a prevalent complication in the analysis of data from longitudinal studies, and remains an active area of research for statisticians and other quantitative methodologists. This tutorial is designed to synthesize and illustrate the broad array of techniques that are used to address outcome-related drop-out, with emphasis on regression-based methods. We begin with a review of important assumptions underlying likelihood-based and semi-parametric models, followed by an overview of models and methods used to draw inferences from incomplete longitudinal data. The majority of the tutorial is devoted to detailed analysis of two studies with substantial rates of drop-out, designed to illustrate the use of effective methods that are relatively easy to apply: in the first example, we use both semi-parametric and fully parametric models to analyse repeated binary responses from a clinical trial of smoking cessation interventions; in the second, pattern mixture models are used to analyse longitudinal CD4 counts from an observational cohort study of HIV-infected women. In each example, we describe exploratory analyses, model formulation, estimation methodology and interpretation of results. Analysis of incomplete data requires making unverifiable assumptions, and these are discussed in detail within the context of each application. Relevant SAS code is provided.

Publication types

  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Biometry / methods*
  • Female
  • HIV Infections / epidemiology
  • Humans
  • Longitudinal Studies*
  • Models, Statistical
  • Patient Dropouts / statistics & numerical data*
  • Regression Analysis
  • Smoking Cessation / statistics & numerical data