A novel lifelong machine learning-based method to eliminate calibration drift in clinical prediction models

Shengqiang Chi; Yu Tian; Feng Wang; Tianshu Zhou; Shan Jin; Jingsong Li

doi:10.1016/j.artmed.2022.102256

Artif Intell Med. 2022 Mar:125:102256. doi: 10.1016/j.artmed.2022.102256. Epub 2022 Feb 12.

Authors

Shengqiang Chi¹, Yu Tian², Feng Wang¹, Tianshu Zhou¹, Shan Jin³, Jingsong Li⁴

Affiliations

¹ Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, China.
² Engineering Research Center of EMR and Intelligent Expert Systems, Ministry of Education, Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China.
³ Zhejiang Topcheer Information Technology Co., Ltd, China.
⁴ Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, China; Engineering Research Center of EMR and Intelligent Expert Systems, Ministry of Education, Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China. Electronic address: ljs@zju.edu.cn.

PMID: 35241261

Abstract

Objective: Clinical prediction models (CPMs) built with artificial intelligence have been shown to benefit clinical practice, yet the deterioration of CPM performance over time has rarely been studied. This paper proposes a model updating method to address the calibration drift caused by data drift.

Materials and methods: The proposed model updating method is based on lifelong machine learning (LML). Its effectiveness is verified on four tumor datasets, and it is compared comprehensively with other model updating methods.
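The keywords list knowledge distillation, but the abstract does not say how it enters the update step. One common pattern, shown here only as an assumption and not as the authors' method, is to train the updated (student) model on a blend of the true labels and the previous (teacher) model's predicted risks. The function names and the `alpha` weighting below are hypothetical.

```python
import math
from typing import Sequence


def binary_cross_entropy(target: float, pred: float, eps: float = 1e-12) -> float:
    """Cross-entropy between a target probability and a predicted probability."""
    pred = min(max(pred, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def distillation_loss(
    y_true: Sequence[int],
    student_probs: Sequence[float],
    teacher_probs: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Blend of a hard-label loss and a soft-label (teacher) loss.

    alpha = 1.0 uses only the observed labels; alpha = 0.0 uses only the
    previous model's predictions. The weighting scheme is an assumption.
    """
    n = len(y_true)
    hard = sum(binary_cross_entropy(y, s) for y, s in zip(y_true, student_probs)) / n
    soft = sum(binary_cross_entropy(t, s) for t, s in zip(teacher_probs, student_probs)) / n
    return alpha * hard + (1.0 - alpha) * soft
```

In a sketch like this, a larger `alpha` lets the updated model follow new data more closely, while a smaller `alpha` keeps it closer to what the earlier model had learned.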

Results: Changes in data distributions cause model performance to drift. The four model updating methods compared differ in how much they improve the discrimination and calibration of the tested models. The LML method proposed in this study improves model performance as much as or more than the other methods. On the colorectal, lung, breast, and prostate cancer datasets, respectively, the proposed method achieved a mean AUC of 0.8249, 0.8780, 0.8261, and 0.8489, a mean AUPRC of 0.7782, 0.9730, 0.4655, and 0.5728, a mean F1 of 0.6866, 0.9552, 0.2985, and 0.3585, and a mean estimated calibration index (ECI) of 0.0320, 0.0338, 0.0101, and 0.0115.
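To make the distinction between discrimination and calibration concrete, the following is a minimal sketch of how such metrics can be computed from predicted risks. It assumes binary outcomes coded 0/1 and probabilities in [0, 1]. The calibration function is a simple binned stand-in chosen for illustration; it is not the ECI as computed in the paper, and all function names are hypothetical.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Discrimination: probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at_threshold(y_true: Sequence[int], y_score: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 after turning risks into class labels at a fixed threshold."""
    tp = fp = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def binned_calibration_error(y_true: Sequence[int], y_score: Sequence[float],
                             n_bins: int = 10) -> float:
    """Calibration: size-weighted mean squared gap between the mean predicted
    risk and the observed event rate within equal-width risk bins."""
    sums = [0.0] * n_bins
    events = [0] * n_bins
    counts = [0] * n_bins
    for y, s in zip(y_true, y_score):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        sums[idx] += s
        events[idx] += y
        counts[idx] += 1
    n = len(y_true)
    error = 0.0
    for b in range(n_bins):
        if counts[b] == 0:
            continue  # empty bins contribute nothing
        gap = sums[b] / counts[b] - events[b] / counts[b]
        error += counts[b] / n * gap ** 2
    return error
```

A model can rank patients well (high AUC) while its predicted risks are systematically too high or too low, which is why a calibration measure is reported alongside the discrimination metrics.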

Discussion: The LML framework simultaneously monitors model performance and the distribution of disease risk characteristics, enabling it to effectively address the performance degradation caused by gradual and sudden data drifts and provide reasonable explanations for the causes of performance degradation.
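The abstract does not state how the distribution of risk characteristics is monitored. As one illustrative possibility, the sketch below compares a feature's values in a reference window with those in a recent window using the population stability index (PSI). The choice of PSI, the bin count, and the 0.2 alert threshold (a common rule of thumb) are all assumptions, not details taken from the paper.

```python
import math
from typing import List, Sequence


def population_stability_index(reference: Sequence[float],
                               current: Sequence[float],
                               n_bins: int = 10,
                               eps: float = 1e-6) -> float:
    """PSI between two samples of one feature, using equal-width bins
    defined on the reference sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = max(0, min(idx, n_bins - 1))  # clamp values outside the reference range
            counts[idx] += 1
        total = len(values)
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / total, eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def drift_detected(reference: Sequence[float], current: Sequence[float],
                   threshold: float = 0.2) -> bool:
    """Flag a feature when its PSI exceeds the (assumed) alert threshold."""
    return population_stability_index(reference, current) > threshold
```

Tracking a statistic like this for each risk factor alongside the performance metrics would indicate which inputs shifted when performance degrades, which is the kind of explanation the discussion describes.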

Conclusion: Monitoring model performance and the underlying data distribution supports a model life cycle built around "development-deployment-maintenance-monitoring". Such monitoring helps ensure that the model provides accurate predictions, guides the model update process, and explains the causes of changes in model performance.

Keywords: Calibration; Cancer; Clinical prediction models; Knowledge distillation; Lifelong machine learning.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Artificial Intelligence*
Calibration
Humans
Machine Learning
Male
Models, Statistical*
Prognosis