Practical and statistical issues in missing data for longitudinal patient-reported outcomes

Stat Methods Med Res. 2014 Oct;23(5):440-59. doi: 10.1177/0962280213476378. Epub 2013 Feb 19.


Patient-reported outcomes are increasingly used in health research, including randomized controlled trials and observational studies. However, the validity of results in longitudinal studies can crucially hinge on the handling of missing data. This paper considers the issues of missing data at each stage of research. Practical strategies for minimizing missingness through careful study design and conduct are given. Statistical approaches that are commonly used, but should be avoided, are discussed, including how these methods can yield biased and misleading results. Methods that are valid for data which are missing at random are outlined, including maximum likelihood methods, multiple imputation and extensions to generalized estimating equations: weighted generalized estimating equations, generalized estimating equations with multiple imputation, and doubly robust generalized estimating equations. Finally, we discuss the importance of sensitivity analyses, including the role of missing not at random models, such as pattern mixture, selection, and shared parameter models. We demonstrate many of these concepts with data from a randomized controlled clinical trial on renal cancer patients, and show that the results are dependent on missingness assumptions and the statistical approach.
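To make the contrast between a commonly used but biased approach and a method valid under missing at random concrete, the following is a minimal simulation sketch, not the analysis from the paper. It assumes a hypothetical two-visit study in which dropout at follow-up depends only on the observed baseline score, and it compares a complete-case mean with an inverse-probability-weighted mean, the idea underlying weighted generalized estimating equations. All names, distributions, and parameter values are illustrative assumptions, and the observation probabilities are treated as known, whereas in practice they would be estimated, for example by logistic regression.

```python
import math
import random


def simulate(n=20000, seed=0):
    """Simulate baseline and follow-up scores with MAR dropout (hypothetical setup)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        baseline = rng.gauss(50, 10)            # always observed
        followup = baseline + rng.gauss(0, 5)   # true population mean is 50
        # Assumption: probability of being observed depends only on baseline,
        # so patients with low baseline scores drop out more often (MAR).
        p_obs = 1 / (1 + math.exp(-(baseline - 50) / 10))
        observed = rng.random() < p_obs
        rows.append((baseline, followup, p_obs, observed))
    return rows


def complete_case_mean(rows):
    """Mean follow-up score among patients who were observed at follow-up."""
    ys = [y for _, y, _, obs in rows if obs]
    return sum(ys) / len(ys)


def ipw_mean(rows):
    """Inverse-probability-weighted mean: each observed patient is weighted by 1 / p_obs."""
    num = sum(y / p for _, y, p, obs in rows if obs)
    den = sum(1 / p for _, _, p, obs in rows if obs)
    return num / den


rows = simulate()
true_mean = sum(y for _, y, _, _ in rows) / len(rows)  # unavailable in a real study
cc = complete_case_mean(rows)
ipw = ipw_mean(rows)

print(f"full-data mean:      {true_mean:.2f}")
print(f"complete-case mean:  {cc:.2f}")
print(f"IPW mean:            {ipw:.2f}")
```

In this setup the complete-case mean overstates the follow-up score because low-scoring patients are under-represented among completers, while the weighted mean stays close to the full-data value. The correction relies on the missing at random assumption and a correctly specified dropout model; it gives no protection when data are missing not at random, which is why sensitivity analyses matter.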

Keywords: Missing data; cancer; generalized estimating equations; maximum likelihood estimation; multiple imputation; patient reported outcomes; quality of life.

MeSH terms

  • Antineoplastic Agents / therapeutic use*
  • Clinical Trials, Phase III as Topic
  • Humans
  • Kidney Neoplasms / drug therapy*
  • Longitudinal Studies
  • Models, Statistical*
  • Patient Outcome Assessment*
  • Quality of Life
  • Randomized Controlled Trials as Topic
  • Research Design

