We review the concept of overfitting, a well-known concern in the machine learning community but less established in the clinical community. Overfitted models may lead to inadequate conclusions that could wrongly, or even harmfully, shape clinical decision-making. Overfitting can be defined as the difference between discriminatory performance on the training and testing data: while it is normal for out-of-sample performance to be equal to, or slightly worse than, training performance in any adequately fitted model, a markedly worse out-of-sample performance suggests relevant overfitting. We delve into resampling methods, specifically recommending k-fold cross-validation and bootstrapping, to arrive at realistic estimates of out-of-sample error during training. We also encourage the use of regularization techniques such as L1 or L2 regularization, and the choice of a level of algorithm complexity appropriate to the dataset at hand. Data leakage is addressed, and we discuss the importance of external validation to assess true out-of-sample performance and, upon successful external validation, to release the model into clinical practice. Finally, for high-dimensional datasets, the concepts of feature reduction using principal component analysis (PCA) and feature elimination using recursive feature elimination (RFE) are elucidated.
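As a minimal sketch of the resampling and regularization ideas above, the following compares apparent (training) discrimination against a k-fold cross-validated estimate for an L2-regularized logistic regression. It assumes scikit-learn, and synthetic data stands in for a real clinical dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a clinical dataset: 500 patients, 30 predictors.
X, y = make_classification(n_samples=500, n_features=30, n_informative=5,
                           random_state=0)

# L2 (ridge) penalty; C is the inverse regularization strength.
model = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)

# Apparent (training) performance: optimistic by construction.
model.fit(X, y)
train_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])

# 5-fold cross-validated AUC: a more realistic out-of-sample estimate.
cv_auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()

print(f"training AUC: {train_auc:.3f}, cross-validated AUC: {cv_auc:.3f}")
```

A large gap between the two numbers, in the sense defined above, would suggest relevant overfitting; here the regularized model keeps them close.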
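The two strategies for high-dimensional data mentioned above can likewise be sketched briefly, again assuming scikit-learn and synthetic data in place of a real clinical dataset: PCA projects many correlated predictors onto a few components, while RFE recursively discards the least important original features.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic high-dimensional stand-in: 300 patients, 100 predictors.
X, y = make_classification(n_samples=300, n_features=100, n_informative=8,
                           random_state=0)

# Feature reduction: project onto the top 10 principal components.
X_pca = PCA(n_components=10).fit_transform(X)

# Feature elimination: recursively drop the least important predictors
# (judged by logistic regression coefficients) until 10 remain.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10).fit(X, y)
X_rfe = X[:, rfe.support_]

print(X_pca.shape, X_rfe.shape)  # both reduce 100 columns to 10
```

Note the difference in interpretability: RFE retains a subset of the original, clinically meaningful predictors, whereas PCA components are linear mixtures of all of them.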
Keywords: Artificial intelligence; Clinical prediction model; Machine intelligence; Machine learning; Prediction; Prognosis.
© 2022. The Author(s), under exclusive license to Springer Nature Switzerland AG.