Penalized maximum likelihood estimation to directly adjust diagnostic and prognostic prediction models for overoptimism: a clinical example

J Clin Epidemiol. 2004 Dec;57(12):1262-70. doi: 10.1016/j.jclinepi.2004.01.020.


Background and objective: There is growing interest in developing prediction models. The accuracy of such models when applied to new patient samples is commonly lower than estimated from the development sample. This may be because of differences between the samples and/or because the developed model was overfitted (too optimistic). Various methods, including bootstrapping techniques, exist for shrinking the regression coefficients afterwards and for adjusting the model's discrimination and calibration for overoptimism. Penalized maximum likelihood estimation (PMLE) is a more rigorous method because adjustment for overfitting is built directly into model development, instead of relying on shrinkage afterwards. PMLE has been described mainly in the statistical literature and is rarely applied to empirical data. Using empirical data, we illustrate the use of PMLE to develop a prediction model.
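
As a rough illustration of what "built directly into model development" can mean, one common formulation of PMLE subtracts a quadratic penalty from the log-likelihood. The abstract does not state which penalty the paper used, so the form below is an assumption for illustration only. Here \ell(\beta) is the ordinary log-likelihood, \lambda \ge 0 is a penalty factor chosen by the analyst, and s_j is a scale factor for predictor j (for example its standard deviation); all three symbols are introduced here and do not appear in the abstract.

```latex
\ell_{\text{pen}}(\beta) \;=\; \ell(\beta) \;-\; \frac{\lambda}{2} \sum_{j=1}^{p} \left( s_j \beta_j \right)^2
```

Setting \lambda = 0 recovers ordinary maximum likelihood, while larger values of \lambda pull the coefficients toward zero during estimation rather than afterwards.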

Methods: The accuracy of the final PMLE model is contrasted with that of the final models derived by ordinary stepwise logistic regression, without and with shrinkage afterwards. The potential advantages and disadvantages of PMLE over the other two strategies are discussed.
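
The contrast between unpenalized and penalized fitting can be sketched in a few lines. This is a minimal sketch, not the authors' analysis: it assumes a simple quadratic (ridge-type) penalty with an unpenalized intercept, uses synthetic data rather than the pulmonary embolism data, and omits stepwise selection and post hoc shrinkage. The function name `fit_logistic`, the penalty value 5.0, and the simulated coefficients are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_logistic(X, y, penalty=0.0, n_iter=50, tol=1e-8):
    """Fit logistic regression by Newton-Raphson with a quadratic penalty.

    Maximizes loglik(beta) - 0.5 * penalty * sum(beta_j ** 2) over the
    slopes; the intercept is left unpenalized. penalty=0 gives ordinary
    maximum likelihood. Assumes predictors are on a comparable scale.
    """
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])   # add intercept column
    P = penalty * np.eye(p + 1)
    P[0, 0] = 0.0                           # do not penalize the intercept
    beta = np.zeros(p + 1)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        grad = Xd.T @ (y - prob) - P @ beta          # penalized score
        w = prob * (1.0 - prob)
        hess = Xd.T @ (Xd * w[:, None]) + P          # penalized information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic development sample (illustrative only): 8 standard-normal
# predictors, of which only the first two truly affect the outcome.
rng = np.random.default_rng(0)
n, p = 100, 8
X = rng.standard_normal((n, p))
true_beta = np.array([1.0, -0.5] + [0.0] * (p - 2))
prob_true = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = (rng.random(n) < prob_true).astype(float)

beta_mle = fit_logistic(X, y, penalty=0.0)   # ordinary maximum likelihood
beta_pen = fit_logistic(X, y, penalty=5.0)   # penalized fit (hypothetical penalty)

print("slope norm, unpenalized:", np.linalg.norm(beta_mle[1:]))
print("slope norm, penalized:  ", np.linalg.norm(beta_pen[1:]))
```

The penalized slopes are pulled toward zero relative to the unpenalized ones. In practice the penalty factor would be chosen by a criterion such as cross-validation or a modified information criterion rather than fixed in advance.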

Results: PMLE leads to smaller prediction errors, allows model reduction to a user-defined degree, and may shrink each predictor differently for overoptimism without sacrificing much of the model's discriminative accuracy.

Conclusion: PMLE is an easily applicable and promising method to directly adjust clinical prediction models for overoptimism.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias*
  • Female
  • Humans
  • Likelihood Functions*
  • Male
  • Middle Aged
  • Models, Statistical*
  • Prognosis
  • Pulmonary Embolism / diagnosis
  • Pulmonary Embolism / therapy
  • Treatment Outcome