An overview of practical approaches for handling missing data in clinical trials

J Biopharm Stat. 2009 Nov;19(6):1055-73. doi: 10.1080/10543400903242795.


For a variety of reasons including poorly designed case report forms (CRFs), incomplete or invalid CRF data entries, and premature treatment or study discontinuations, missing data is a common phenomenon in controlled clinical trials. With the widely accepted use of the intent-to-treat (ITT) analysis dataset as the primary analysis dataset for the analysis of controlled clinical trial data, the presence of missing data could lead to complicated data analysis strategies and subsequently to controversy in the interpretation of trial results. In this article, we review the mechanisms of missing data and some common approaches to analyzing missing data with an emphasis on study dropouts. We discuss the importance of understanding the reasons for study dropouts with ways to assess the mechanisms of missingness. Finally, we discuss the results of a comparative Monte Carlo investigation of the performance characteristics of commonly utilized statistical methods for the analysis of clinical trial data with dropouts. The methods investigated include the mixed effects model for repeated measurements (MMRM), weighted and unweighted generalized estimating equations (GEE) methods for the available case data, multiple-imputation-based GEE (MI-GEE), complete case (CC) analysis of covariance (ANCOVA), and last observation carried forward (LOCF) ANCOVA. Simulation experiments for the repeated measures model with missing at random (MAR) dropout, under varying dropout rates and intrasubject correlation, show that the LOCF ANCOVA, CC ANCOVA, and unweighted GEE methods perform poorly in terms of percent relative bias for estimating a difference in means effect, while the MI-GEE and weighted GEE methods both have less power for rejecting a zero difference in means hypothesis.
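As a rough illustration of the kind of Monte Carlo setup described above, the following Python sketch simulates a two-arm repeated-measures trial, imposes monotone MAR dropout, and compares a complete-case and an LOCF estimate of the final-visit difference in means. It is not the authors' simulation design: the function names, the random-intercept data model, the logistic dropout rule, and all parameter values are assumptions made for illustration, and a plain difference in means stands in for the ANCOVA analyses named in the abstract.

```python
import numpy as np


def simulate_trial(n_per_arm=500, n_visits=4, effect=1.0, subject_sd=1.0, seed=0):
    """Simulate complete data for two arms (hypothetical data model).

    The treatment effect grows linearly and reaches `effect` at the final
    visit; a shared random intercept induces intrasubject correlation.
    """
    rng = np.random.default_rng(seed)
    arm = np.repeat([0, 1], n_per_arm)
    visits = np.arange(1, n_visits + 1)
    mean = np.outer(arm, effect * visits / n_visits)
    subject = rng.normal(0.0, subject_sd, size=(2 * n_per_arm, 1))
    y = mean + subject + rng.normal(0.0, 1.0, size=mean.shape)
    return arm, y


def apply_mar_dropout(y, intercept=-2.0, slope=0.8, seed=1):
    """Impose monotone dropout that depends only on the last observed value.

    Because the dropout probability at visit j uses the value at visit j-1,
    which is observed for every subject still in the study, the mechanism
    is MAR. The logistic form and coefficients are assumptions.
    """
    rng = np.random.default_rng(seed)
    y_obs = y.copy()
    n, t = y.shape
    active = np.ones(n, dtype=bool)
    for j in range(1, t):
        p_drop = 1.0 / (1.0 + np.exp(-(intercept - slope * y[:, j - 1])))
        drops = active & (rng.random(n) < p_drop)
        active &= ~drops
        y_obs[~active, j] = np.nan  # missing from dropout onward
    return y_obs


def locf(y_obs):
    """Carry each subject's last observed value forward over missing visits."""
    filled = y_obs.copy()
    for j in range(1, filled.shape[1]):
        missing = np.isnan(filled[:, j])
        filled[missing, j] = filled[missing, j - 1]
    return filled


def final_visit_difference(arm, y):
    """Difference in final-visit means (treated minus control), skipping NaN."""
    final = y[:, -1]
    keep = ~np.isnan(final)
    return final[keep & (arm == 1)].mean() - final[keep & (arm == 0)].mean()


arm, y_full = simulate_trial()
y_obs = apply_mar_dropout(y_full)
print("full data     :", final_visit_difference(arm, y_full))
print("complete case :", final_visit_difference(arm, y_obs))
print("LOCF          :", final_visit_difference(arm, locf(y_obs)))
```

Repeating this over many seeds, dropout rates, and values of `subject_sd` would give Monte Carlo estimates of bias for each method; a single run like the one above only shows the mechanics.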

Publication types

  • Review

MeSH terms

  • Arthritis, Rheumatoid / drug therapy
  • Clinical Trials as Topic / statistics & numerical data*
  • Computer Simulation
  • Data Collection
  • Data Interpretation, Statistical*
  • Humans
  • Longitudinal Studies
  • Patient Dropouts