Machine Learning Coronary Artery Disease Prediction Based on Imaging and Non-Imaging Data

Diagnostics (Basel). 2022 Jun 14;12(6):1466. doi: 10.3390/diagnostics12061466.


The prediction of obstructive atherosclerotic disease is clinically important for decision making. In this study, a machine learning predictive model based on a gradient boosting classifier is presented, aiming to distinguish patients at high risk of coronary artery disease (CAD) from those at low risk. The machine learning methodology includes five steps: preprocessing of the input data, class imbalance handling with the Easy Ensemble algorithm, recursive feature elimination, training of the gradient boosting classifier, and model evaluation. The model was fine-tuned through a randomized search over its hyper-parameters, using an internal 3-fold cross-validation. In total, 187 participants with suspected CAD, who had previously undergone computed tomography coronary angiography (CTCA) during the EVINCI and ARTreat clinical studies, were prospectively included to undergo follow-up CTCA. The predictive model was trained using imaging data (geometrical and blood flow based) and non-imaging data. Using both imaging and non-imaging data, the model achieved an overall predictive accuracy of 0.81. The innovative aspect of this study is the combination of imaging-based data with the typical CAD risk factors to provide an integrated CAD risk-predictive model.
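The five steps described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (generated with `make_classification`), the helper names `balanced_subsets`, `fit_ensemble`, and `predict_ensemble` are hypothetical, the hyper-parameter ranges and the number of retained features are arbitrary choices, and the Easy Ensemble step is approximated by hand as repeated balanced undersampling of the majority class with probability averaging across the resulting models.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def balanced_subsets(y, n_subsets, rng):
    """Yield index arrays holding every minority sample plus an
    equally sized random draw (without replacement) of the majority class."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y != minority)
    for _ in range(n_subsets):
        drawn = rng.choice(maj_idx, size=len(min_idx), replace=False)
        yield np.concatenate([min_idx, drawn])


def fit_ensemble(X, y, n_subsets=3, seed=0):
    """Fit one tuned pipeline per balanced subset (Easy Ensemble-style)."""
    rng = np.random.default_rng(seed)
    # Hypothetical search space; the abstract does not list the tuned ranges.
    space = {
        "gb__n_estimators": [25, 50, 100],
        "gb__learning_rate": [0.01, 0.05, 0.1, 0.2],
        "gb__max_depth": [2, 3, 4],
    }
    models = []
    for idx in balanced_subsets(y, n_subsets, rng):
        pipe = Pipeline([
            ("scale", StandardScaler()),  # preprocessing
            ("rfe", RFE(GradientBoostingClassifier(n_estimators=20,
                                                   random_state=seed),
                        n_features_to_select=5, step=2)),  # feature elimination
            ("gb", GradientBoostingClassifier(random_state=seed)),
        ])
        # Randomized hyper-parameter search with internal 3-fold CV.
        search = RandomizedSearchCV(pipe, space, n_iter=4, cv=3,
                                    scoring="accuracy", random_state=seed)
        search.fit(X[idx], y[idx])
        models.append(search.best_estimator_)
    return models


def predict_ensemble(models, X):
    """Average the positive-class probabilities and threshold at 0.5."""
    proba = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
    return (proba >= 0.5).astype(int)


# Synthetic, imbalanced stand-in for the clinical data (assumption).
X, y = make_classification(n_samples=200, n_features=12, n_informative=5,
                           weights=[0.75, 0.25], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = fit_ensemble(X_train, y_train)
y_pred = predict_ensemble(models, X_test)
acc = accuracy_score(y_test, y_pred)  # model evaluation
print(f"held-out accuracy on synthetic data: {acc:.2f}")
```

Placing the scaler and the feature elimination inside the pipeline means both are refit within each cross-validation fold, so the internal 3-fold scores are not inflated by information from the validation split. The accuracy printed here describes the synthetic data only and is unrelated to the 0.81 reported in the study.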

Keywords: coronary artery disease; coronary artery disease risk stratification; machine learning models; noninvasive cardiovascular imaging.