Multiple Imputation: Dealing With Missing Data

Nephrol Dial Transplant. 2013 Oct;28(10):2415-20. doi: 10.1093/ndt/gft221. Epub 2013 May 31.

Abstract

In many fields, including the field of nephrology, missing data are unfortunately an unavoidable problem in clinical/epidemiological research. The most common methods for dealing with missing data are complete case analysis (excluding patients with missing data), mean substitution (replacing missing values of a variable with the average of known values for that variable), and last observation carried forward. However, these methods have severe drawbacks potentially resulting in biased estimates and/or standard errors. In recent years, a new method has arisen for dealing with missing data called multiple imputation. This method predicts missing values based on other data present in the same patient. This procedure is repeated several times, resulting in multiple imputed data sets. Thereafter, estimates and standard errors are calculated in each imputation set and pooled into one overall estimate and standard error. The main advantage of this method is that missing data uncertainty is taken into account. Another advantage is that the method of multiple imputation gives unbiased results when data are missing at random, which is the most common type of missing data in clinical practice, whereas conventional methods do not. However, the method of multiple imputation has scarcely been used in medical literature. We, therefore, encourage authors to do so in the future when possible.
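
The impute, analyse, and pool cycle described above can be sketched in a few lines of Python. This is only an illustration and is not taken from the article: the function names, the imputation model (a linear regression refitted on a bootstrap sample, with random noise added to each prediction), the default of 20 imputations, and the use of Rubin's rules for pooling are all assumptions made here. It assumes one fully observed variable x and one partly missing variable y, with the mean of y as the quantity of interest.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and squared standard errors (Rubin's rules)."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    pooled = estimates.mean()          # overall estimate
    within = variances.mean()          # average within-imputation variance
    between = estimates.var(ddof=1)    # variance between imputed data sets
    total = within + (1.0 + 1.0 / m) * between
    return pooled, np.sqrt(total)


def impute_once(x, y, rng):
    """Return a copy of y with missing values (NaN) filled in from x.

    Assumed model: y is linear in x. The regression is refitted on a
    bootstrap sample of the observed cases and residual noise is added,
    so that repeated calls give different plausible values.
    """
    observed = ~np.isnan(y)
    idx = rng.choice(np.flatnonzero(observed), size=observed.sum(), replace=True)
    slope, intercept = np.polyfit(x[idx], y[idx], 1)
    resid_sd = np.std(y[idx] - (intercept + slope * x[idx]), ddof=2)
    filled = y.copy()
    n_missing = int((~observed).sum())
    filled[~observed] = (intercept + slope * x[~observed]
                         + rng.normal(0.0, resid_sd, n_missing))
    return filled


def multiple_imputation_mean(x, y, m=20, seed=0):
    """Estimate the mean of y and its standard error from m imputed data sets."""
    rng = np.random.default_rng(seed)
    estimates, variances = [], []
    for _ in range(m):
        filled = impute_once(x, y, rng)
        estimates.append(filled.mean())
        variances.append(filled.var(ddof=1) / len(filled))
    return pool_rubin(estimates, variances)


if __name__ == "__main__":
    # Simulated data: y is missing whenever x is large, i.e. missingness
    # depends only on an observed variable (missing at random).
    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    y = 2.0 + 1.5 * x + rng.normal(size=200)
    y[x > 0.5] = np.nan
    estimate, std_error = multiple_imputation_mean(x, y)
    print(f"pooled mean = {estimate:.2f}, pooled SE = {std_error:.2f}")
```

The between-imputation term in the pooled variance is what carries the missing data uncertainty into the final standard error. Mean substitution, by contrast, fills in a single fixed value and so has no such term.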

Keywords: complete case; last observation carried forward; mean substitution; missing data; multiple imputation.

MeSH terms

  • Data Collection / methods*
  • Data Collection / statistics & numerical data*
  • Data Interpretation, Statistical*
  • Humans
  • Research Design*