Development and validation of a prediction model with missing predictor data: a practical approach

J Clin Epidemiol. 2010 Feb;63(2):205-14. doi: 10.1016/j.jclinepi.2009.03.017. Epub 2009 Jul 12.

Abstract

Objective: To illustrate the sequence of steps needed to develop and validate a clinical prediction model when missing predictor values have been multiply imputed.

Study design and setting: We used data from consecutive primary care patients suspected of having deep venous thrombosis (DVT) to develop and validate a diagnostic model for the presence of DVT. Missing values were imputed 10 times with the MICE conditional imputation method. After selecting predictors and transformations for continuous predictors according to three different methods, we estimated regression coefficients and performance measures.

Results: The three methods to select predictors and transformations of continuous predictors showed similar results. Rubin's rules could easily be applied to estimate regression coefficients and performance measures, once predictors and transformations were selected.
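As an illustration of the pooling step mentioned in the Results, the following is a minimal sketch of Rubin's rules for a single regression coefficient. It is not the authors' code: the function name pool_rubin and the numbers in the usage lines are hypothetical and chosen only for illustration. The sketch assumes each of the m imputed data sets yields one point estimate and its squared standard error.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates: per-imputation point estimates (e.g. a regression coefficient)
    variances: per-imputation squared standard errors of that estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)           # pooled point estimate
    u_bar = mean(variances)           # within-imputation variance
    b = variance(estimates)           # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b   # total variance of the pooled estimate
    return q_bar, total


# Hypothetical values for illustration only, not taken from the study.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
pooled_se = total_var ** 0.5
```

The pooled standard error is the square root of the total variance. The same function would be called once per coefficient, with m = 10 in the setting described above.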

Conclusion: We provide a practical approach for model development and validation with multiply imputed data.

Publication types

  • Multicenter Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Cross-Sectional Studies
  • Data Interpretation, Statistical*
  • Female
  • Humans
  • Male
  • Middle Aged
  • Models, Statistical*
  • Primary Health Care / methods
  • Reproducibility of Results
  • Risk Factors
  • Venous Thrombosis / diagnosis