Multiple imputation as a flexible tool for missing data handling in clinical research

Behav Res Ther. 2017 Nov:98:4-18. doi: 10.1016/j.brat.2016.11.008. Epub 2016 Nov 18.


The last 20 years have seen an uptick in research on missing data problems, and most software applications now implement one or more sophisticated missing data handling routines (e.g., multiple imputation or maximum likelihood estimation). Despite their superior statistical properties (e.g., less stringent assumptions, greater accuracy and power), the adoption of these modern analytic approaches is not uniform in psychology and related disciplines. Thus, the primary goal of this manuscript is to describe and illustrate the application of multiple imputation. Although maximum likelihood estimation is perhaps the easiest method to use in practice, psychological data sets often feature complexities that are currently difficult to handle appropriately in the likelihood framework (e.g., mixtures of categorical and continuous variables), but relatively simple to treat with imputation. The paper describes a number of practical issues that clinical researchers are likely to encounter when applying multiple imputation, including mixtures of categorical and continuous variables, item-level missing data in questionnaires, significance testing, interaction effects, and multilevel missing data. Analysis examples illustrate imputation with software packages that are freely available on the internet.
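
As a rough illustration of the pooling step that follows imputation, the sketch below combines estimates from several imputed data sets using the standard combining rules usually attributed to Rubin. It is not taken from the paper: the function name `pool_rubin` and the input numbers are hypothetical, and the abstract does not say which pooling procedure or software the analysis examples use.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Hypothetical helper for illustration only; it assumes each imputed
    data set has already been analyzed separately.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b      # total sampling variance of the pooled estimate
    return q_bar, t


# Made-up results from m = 3 imputed data sets, not real study data.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The square root of the total variance serves as the pooled standard error. The between-imputation term is what carries the extra uncertainty due to the missing values, and analyzing a single imputed data set would omit it.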

Keywords: Attrition; Maximum likelihood estimation; Missing data; Multiple imputation.

MeSH terms

  • Clinical Studies as Topic / methods*
  • Data Interpretation, Statistical
  • Humans
  • Software