As the use of machine learning (ML) algorithms in the development of clinical prediction models has increased, researchers have become more aware of the deleterious effects that stem from a lack of reporting standards. One of the most obvious consequences is the poor reproducibility of current prediction models. To characterize methods that improve reproducibility and support better clinical performance, we use a previously proposed taxonomy that separates reproducibility into 3 components: technical, statistical, and conceptual reproducibility. Following this framework, we discuss common errors that lead to poor reproducibility, highlight the importance of generalizability when evaluating an ML model's performance, and provide suggestions for optimizing generalizability to ensure adequate performance. These efforts are a necessity before such models are applied to patient care.
Keywords: Machine learning; Overfitting; Predictive modeling; Reproducibility.
Copyright © 2020 Elsevier Inc. All rights reserved.