Missing data and imputation: a practical illustration in a prognostic study on low back pain

J Manipulative Physiol Ther. 2012 Jul;35(6):464-71. doi: 10.1016/j.jmpt.2012.07.002.

Abstract

Objective: When prediction models are developed by complete case analysis (CCA), missing information in either baseline characteristics (predictors) or outcomes may lead to biased results. Multiple imputation (MI) has been shown to be suitable for obtaining unbiased results. This study provides researchers with an empirical illustration of the use of MI in a data set on low back pain by comparing MI with the more commonly used CCA. It shows the effects of imputing missing information on the composition and performance of prognostic models, distinguishing between imputation of missing values in baseline characteristics and in outcome data.

Methods: Data came from the Beliefs about Backpain cohort, a study of psychologic obstacles to recovery in primary care back pain patients in the United Kingdom. Candidate predictors included demographics, back pain characteristics, and psychologic variables. Complete case analysis was compared with MI within patients with complete outcome but missing baseline data (n=809) and patients with missing baseline or outcome data (n=1591). Multiple imputation was performed using a Multiple Imputation by Chained Equations (MICE) procedure.
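As a rough illustration of how results from several imputed data sets are usually combined in MI, the sketch below applies Rubin's rules to one regression coefficient. The abstract does not state how the authors pooled their estimates, so this is an assumption about a standard MI workflow, not a description of their analysis. The function name `pool_rubin` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates: point estimates of the parameter, one per imputed data set
    variances: squared standard errors of those estimates, in the same order
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # average within-imputation variance
    b = variance(estimates)          # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total, sqrt(total)


# Hypothetical coefficients and variances from three imputed data sets.
pooled, total_var, pooled_se = pool_rubin([0.40, 0.50, 0.60], [0.04, 0.04, 0.04])
print(pooled, total_var, pooled_se)
```

The between-imputation term is what distinguishes MI from filling in a single value: it widens the standard error to reflect uncertainty about the missing data.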

Results: Cases with missing outcome data (n=782, 49.1%) and cases with missing baseline data (n=116, 8%) both differed from complete cases in the distribution of some predictors and more often had a poor outcome. When CCA was compared with MI, model composition was found to be affected.

Conclusions: Complete case analysis can give biased results, even when only small amounts of data are missing. Now that MI is available in standard statistical software, we recommend that it be used to handle missing data.

Publication types

  • Comparative Study

MeSH terms

  • Adult
  • Aged
  • Bias
  • Cohort Studies
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Databases, Factual
  • Exercise Therapy / methods
  • Female
  • Humans
  • Low Back Pain / diagnosis
  • Low Back Pain / epidemiology*
  • Low Back Pain / rehabilitation*
  • Male
  • Middle Aged
  • Models, Statistical*
  • Outcome Assessment, Health Care / methods*
  • Predictive Value of Tests
  • Prognosis
  • Research Design
  • Retrospective Studies
  • Severity of Illness Index