Machine learning algorithm-based risk prediction model of coronary artery disease

Mol Biol Rep. 2018 Oct;45(5):901-910. doi: 10.1007/s11033-018-4236-2. Epub 2018 Jul 11.

Abstract

In view of the high mortality associated with coronary artery disease (CAD), the development of an early prediction tool would help reduce the burden of the disease. A database comprising demographic, conventional, and folate/xenobiotic pathway genetic risk factors of 648 subjects (364 cases of CAD and 284 healthy controls) was used as the basis to develop CAD risk and percentage stenosis prediction models using ensemble machine learning algorithms (EMLA), multifactor dimensionality reduction (MDR) and recursive partitioning (RP). The EMLA model performed better than the other models in both disease prediction (89.3%) and stenosis prediction (82.5%). This model identified hypertension and alcohol intake as the key predictors of CAD risk, followed by cSHMT C1420T, GCPII C1561T, diabetes, GSTT1, CYP1A1 m2, TYMS 5'-UTR 28 bp tandem repeat and MTRR A66G. The MDR and RP models agreed in identifying increasing age, hypertension and cSHMT C1420T as the key determinants interacting to modulate CAD risk. Receiver operating characteristic curves indicated the clinical utility of the developed models in the following order: EMLA (C = 0.96) > RP (C = 0.83) > MDR (C = 0.80). The stenosis prediction model showed that the xenobiotic pathway genetic variants CYP1A1 m2 and GSTT1 are the key determinants of percentage stenosis. Diabetes, diet, alcohol intake, hypertension and MTRR A66G are the other determinants of stenosis. These eleven variables contribute towards 82.5% stenosis. To conclude, the EMLA model exhibited higher predictive performance in both disease prediction and stenosis prediction. This can be attributed to the higher number of iterations in the EMLA model, which can increase prediction accuracy.
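
The C values reported for the receiver operating characteristic curves can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The following is a minimal sketch of that calculation, not the authors' implementation: the function name `c_statistic` and the toy labels and scores are hypothetical and are not drawn from the study data.

```python
def c_statistic(labels, scores):
    """Return the C statistic (area under the ROC curve).

    Computed as the fraction of case/control pairs in which the case
    has the higher score, with tied scores counted as half a win.
    labels: 1 for a case, 0 for a control.
    scores: predicted risk for each subject, in the same order.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data (assumption for illustration only).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_c = c_statistic(toy_labels, toy_scores)
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is the scale on which the three models are ordered above.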

Keywords: Coronary artery disease; Ensemble machine learning algorithm; Folate and xenobiotic pathways; Multifactor dimensionality reduction; Recursive partitioning.

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • Case-Control Studies
  • Coronary Artery Disease / genetics*
  • Coronary Artery Disease / mortality
  • Cytochrome P-450 CYP1A1 / genetics
  • Cytochrome P-450 CYP1A1 / metabolism
  • Epistasis, Genetic / genetics
  • Female
  • Folic Acid / metabolism
  • Forecasting / methods*
  • Genetic Predisposition to Disease / genetics
  • Glycine Hydroxymethyltransferase / genetics
  • Glycine Hydroxymethyltransferase / metabolism
  • Humans
  • Machine Learning
  • Male
  • Middle Aged
  • Multifactor Dimensionality Reduction / methods*
  • Polymorphism, Single Nucleotide / genetics
  • Risk Factors
  • Xenobiotics / metabolism

Substances

  • Xenobiotics
  • Folic Acid
  • CYP1A1 protein, human
  • Cytochrome P-450 CYP1A1
  • Glycine Hydroxymethyltransferase
  • SHMT protein, human