Analysis of longitudinal binary data with missing data due to dropouts

J Biopharm Stat. 2005;15(6):993-1007. doi: 10.1080/10543400500266692.


Longitudinal binary data from clinical trials with missing observations are frequently analyzed using the Last Observation Carried Forward (LOCF) method to impute missing values at a visit (e.g., the prospectively defined primary time point for analysis at the end of the treatment period). Usually, to understand the time trend in treatment response, analyses are also performed separately on data at intermediate time points. The objective of such analyses is to estimate the proportion of "response" at a time point and then to compare two treatment groups (e.g., drug vs. placebo) by testing for a difference between the two response proportions. The commonly used methods are Fisher's exact test, the chi-squared test, the Cochran-Mantel-Haenszel test, and logistic regression. Analyses based on the Observed Cases (OC) data are usually also performed and compared with those obtained by LOCF. Another approach that is gaining popularity (after the introduction of PROC GENMOD by the SAS Institute) is the method of Generalized Estimating Equations (GEE), which includes all repeated observations in the analysis in a more comprehensive manner. It is now well recognized, however, that results obtained by these methods are susceptible to bias, depending on the "missing data mechanism." Of particular concern is the bias introduced by dropouts that are not missing at random (NMAR). Because no single method handles dropouts satisfactorily, a consensus is emerging in favor of analyzing the data by several methods (including methods that handle NMAR dropouts) to evaluate the sensitivity of results to model assumptions. In this article, we demonstrate the application of the following methods for handling dropouts in longitudinal binary data: Generalized Linear Mixture Models (GLMM) for NMAR dropouts, Weighted GEE for dropouts that are missing at random (MAR), and GEE for dropouts that are missing completely at random (MCAR). The results are also compared with those obtained by univariate logistic regression on both LOCF and OC data.
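
The contrast between LOCF and OC estimates at the final visit can be illustrated with a minimal Python sketch. It is not taken from the article: the data are hypothetical toy values, and the helper names (locf, response_rate, final_visit_rates) are assumptions introduced only for illustration. Each row is one subject, each column one visit, and None marks visits missed after dropout.

```python
from typing import List, Optional, Tuple

Visits = List[Optional[int]]  # 1 = response, 0 = no response, None = missing


def locf(visits: Visits) -> Visits:
    """Carry the last observed value forward over missing (None) visits."""
    filled: Visits = []
    last: Optional[int] = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def response_rate(values: Visits) -> Optional[float]:
    """Proportion of responders among the non-missing values."""
    observed = [v for v in values if v is not None]
    return sum(observed) / len(observed) if observed else None


def final_visit_rates(group: List[Visits]) -> Tuple[Optional[float], Optional[float]]:
    """Return (OC rate, LOCF rate) at the last scheduled visit."""
    oc = response_rate([subject[-1] for subject in group])
    carried = response_rate([locf(subject)[-1] for subject in group])
    return oc, carried


# Hypothetical toy data (not from the article): 4 subjects per arm, 4 visits.
drug = [
    [0, 1, 1, 1],
    [0, 0, None, None],
    [1, 1, 1, None],
    [0, 1, None, None],
]
placebo = [
    [0, 0, 1, 0],
    [0, 0, 0, None],
    [1, None, None, None],
    [0, 0, 0, 0],
]

drug_oc, drug_locf = final_visit_rates(drug)
placebo_oc, placebo_locf = final_visit_rates(placebo)

print(f"OC   difference in response: {drug_oc - placebo_oc:.2f}")
print(f"LOCF difference in response: {drug_locf - placebo_locf:.2f}")
```

In this toy example the estimated treatment difference changes depending on whether dropouts are excluded (OC) or imputed (LOCF), which is the kind of sensitivity to the handling of missing data that motivates comparing several methods.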

MeSH terms

  • Clinical Trials as Topic / statistics & numerical data*
  • Data Interpretation, Statistical*
  • Linear Models
  • Logistic Models
  • Longitudinal Studies*
  • Patient Dropouts / statistics & numerical data*