Missing data on the Center for Epidemiologic Studies Depression Scale: a comparison of 4 imputation techniques

Res Social Adm Pharm. 2007 Mar;3(1):1-27. doi: 10.1016/j.sapharm.2006.04.001.


Background: Missing data are widespread in the medical sciences. Given their prevalence, researchers must be prepared to address problems that arise when data are missing.

Objectives: The objectives were to (1) provide an estimate of bias for each imputation technique with known values from data engineered to be missing completely at random; (2) determine whether different Center for Epidemiologic Studies Depression (CES-D) Scale scores were obtained from item-mean, person-mean, regression, and hot-deck imputation techniques and whether they differed from the CES-D score obtained from complete cases; and (3) determine whether the variables that predicted the CES-D scores were the same for the complete cases and each of the 4 imputation techniques.
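Two of the four techniques named above, item-mean and person-mean imputation, can be illustrated with a minimal sketch. The abstract does not define the techniques, so the definitions used here are the commonly used ones and are an assumption: item-mean fills a missing item with that item's mean across respondents who answered it, and person-mean fills it with the respondent's own mean across the items they answered. The function names and the toy data (4 items rather than the 20 CES-D items) are hypothetical and are not taken from the study.

```python
from statistics import mean

def item_mean_impute(rows):
    """Fill each missing item (None) with that item's mean over respondents who answered it.

    Assumes every item was answered by at least one respondent.
    """
    n_items = len(rows[0])
    item_means = []
    for j in range(n_items):
        observed = [r[j] for r in rows if r[j] is not None]
        item_means.append(mean(observed))
    return [[item_means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def person_mean_impute(rows):
    """Fill each missing item (None) with the respondent's own mean over answered items.

    Assumes every respondent answered at least one item.
    """
    filled = []
    for r in rows:
        observed = [v for v in r if v is not None]
        fill = mean(observed)
        filled.append([fill if v is None else v for v in r])
    return filled

# Hypothetical toy data: 3 respondents, 4 items scored 0-3; None marks a skipped item.
responses = [
    [0, 1, 2, 1],
    [3, None, 3, 3],
    [1, 1, None, 0],
]

item_filled = item_mean_impute(responses)
person_filled = person_mean_impute(responses)

# Scale scores are the row sums after imputation.
item_scores = [sum(r) for r in item_filled]
person_scores = [sum(r) for r in person_filled]
```

The two approaches can give different scale scores for the same respondent, because one borrows information from other respondents and the other from the respondent's remaining answers.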

Methods: Depressive symptoms were assessed in patients (N=2,317) in an international clinical trial comparing high blood pressure treatments between April 1, 1999, and October 31, 1999. Patients were mailed surveys after randomization. Depressive symptoms were measured using the CES-D Scale. Respondents who completed all 20 items were compared with those who did not, using independent t tests and chi-square tests. Z scores were used to test differences between CES-D means, and multiple regression models were used to predict the CES-D scores for the 4 imputation techniques and the complete case data.
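A Z score for the difference between two means might be computed as in the sketch below. The abstract does not give the exact formula, so the unpooled two-sample form used here is an assumption, and it treats the two groups as independent, which would only approximate a comparison of imputed and complete case data that share respondents. The function name is hypothetical. In the usage line, the two means are the highest imputed mean and the complete case mean reported in the abstract; the standard deviations and group sizes are placeholders, because the abstract does not report them.

```python
from math import erf, sqrt

def two_sample_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (z, two-sided p) for the difference between two independent means.

    Uses the unpooled standard error sqrt(sd_a^2/n_a + sd_b^2/n_b) and the
    standard normal distribution for the p value.
    """
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p = 2 * (1 - normal_cdf)
    return z, p

# Means come from the abstract; SDs (10.0) and group sizes (2000, 1500) are
# PLACEHOLDERS chosen only to make the example run, not study values.
z, p = two_sample_z(14.68, 10.0, 2000, 14.06, 10.0, 1500)
```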

Results: Imputed CES-D mean scores ranged from 14.58 to 14.68. The 4 imputed CES-D mean scores were consistently, but not significantly, higher than the complete case CES-D mean of 14.06. Imputed mean scores were similar to each other and to the complete case mean score. The 4 regression models predicting the imputed CES-D scores yielded similar predictions. With the exception of sex, the same variables predicted the complete case CES-D and the imputed CES-D scores.

Conclusions: All the imputed means were similar to the complete case mean. Imputing missing data did not significantly alter the conclusions regarding which factors were associated with variations in CES-D scores. Since imputation has the potential to increase statistical power, researchers dealing with missing CES-D scores should consider imputing missing data.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Data Interpretation, Statistical
  • Depression / diagnosis*
  • Depression / epidemiology*
  • Female
  • Humans
  • Male
  • Outcome Assessment, Health Care / methods
  • Outcome Assessment, Health Care / statistics & numerical data*
  • Psychiatric Status Rating Scales*
  • Psychometrics / methods*
  • Quality of Life
  • Random Allocation
  • Regression Analysis
  • Reproducibility of Results
  • Research Design
  • Self-Assessment*
  • Surveys and Questionnaires