Patient-reported outcomes are increasingly used in health research, including randomized controlled trials and observational studies. However, the validity of results in longitudinal studies can crucially hinge on the handling of missing data. This paper considers the issues of missing data at each stage of research. Practical strategies for minimizing missingness through careful study design and conduct are given. Statistical approaches that are commonly used, but should be avoided, are discussed, including how these methods can yield biased and misleading results. Methods that are valid for data which are missing at random are outlined, including maximum likelihood methods, multiple imputation and extensions to generalized estimating equations: weighted generalized estimating equations, generalized estimating equations with multiple imputation, and doubly robust generalized estimating equations. Finally, we discuss the importance of sensitivity analyses, including the role of missing not at random models, such as pattern mixture, selection, and shared parameter models. We demonstrate many of these concepts with data from a randomized controlled clinical trial on renal cancer patients, and show that the results are dependent on missingness assumptions and the statistical approach.
Keywords: Missing data; cancer; generalized estimating equations; maximum likelihood estimation; multiple imputation; patient reported outcomes; quality of life.
© The Author(s) 2013 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.