Association rule discovery with the train and test approach for heart disease prediction

IEEE Trans Inf Technol Biomed. 2006 Apr;10(2):334-43. doi: 10.1109/titb.2006.864475.

Abstract

Association rules represent a promising technique to improve heart disease prediction. Unfortunately, when association rules are applied to a medical data set, they produce an extremely large number of rules. Most of these rules are medically irrelevant, and the time required to find them can be impractical. A more important issue is that, in general, association rules are mined on the entire data set without validation on an independent sample. To address these limitations, we introduce an algorithm that uses search constraints to reduce the number of rules, searches for association rules on a training set, and finally validates them on an independent test set. The medical significance of discovered rules is evaluated with support, confidence, and lift. Association rules are applied to a real data set containing medical records of patients with heart disease. In medical terms, association rules relate heart perfusion measurements and risk factors to the degree of disease in four specific arteries. Search constraints and test set validation significantly reduce the number of association rules and produce a set of rules with high predictive accuracy. We exhibit important rules with high confidence, high lift, or both, that remain valid on the test set across several runs. These rules represent valuable medical knowledge.
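
The three metrics named in the abstract have standard definitions, and the train-and-test idea can be illustrated with a small sketch. For a rule X => Y over n records, support is the fraction of records containing both X and Y, confidence is support(X and Y) / support(X), and lift is confidence / support(Y). The Python code below is not the authors' algorithm: it omits the search constraints entirely, and the function names, item labels, and thresholds are hypothetical, chosen only for illustration. It scores one candidate rule on a training set and checks whether it still meets the same thresholds on a held-out test set.

```python
from typing import FrozenSet, List, Tuple

Record = FrozenSet[str]  # one patient record as a set of items (hypothetical encoding)


def rule_metrics(records: List[Record],
                 antecedent: FrozenSet[str],
                 consequent: FrozenSet[str]) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) of antecedent => consequent."""
    n = len(records)
    if n == 0:
        return 0.0, 0.0, 0.0
    n_a = sum(1 for r in records if antecedent <= r)    # records containing X
    n_c = sum(1 for r in records if consequent <= r)    # records containing Y
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)  # X and Y
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def passes(metrics: Tuple[float, float, float],
           min_support: float, min_confidence: float, min_lift: float) -> bool:
    """Check one (support, confidence, lift) triple against the thresholds."""
    support, confidence, lift = metrics
    return (support >= min_support
            and confidence >= min_confidence
            and lift >= min_lift)


def holds_on_train_and_test(train: List[Record], test: List[Record],
                            antecedent: FrozenSet[str],
                            consequent: FrozenSet[str],
                            min_support: float, min_confidence: float,
                            min_lift: float) -> bool:
    """Keep a rule only if it meets the thresholds on both samples."""
    thresholds = (min_support, min_confidence, min_lift)
    return (passes(rule_metrics(train, antecedent, consequent), *thresholds)
            and passes(rule_metrics(test, antecedent, consequent), *thresholds))


if __name__ == "__main__":
    # Made-up toy records; the item names are illustrative, not from the study.
    train = [frozenset(s) for s in (
        {"risk_factor", "perfusion_defect", "artery_diseased"},
        {"risk_factor", "perfusion_defect", "artery_diseased"},
        {"risk_factor", "perfusion_defect"},
        {"artery_diseased"},
        {"risk_factor"},
    )]
    test = [frozenset(s) for s in (
        {"risk_factor", "perfusion_defect", "artery_diseased"},
        {"perfusion_defect"},
        {"risk_factor", "perfusion_defect", "artery_diseased"},
        {"risk_factor"},
    )]
    x = frozenset({"risk_factor", "perfusion_defect"})
    y = frozenset({"artery_diseased"})
    print("train:", rule_metrics(train, x, y))
    print("test: ", rule_metrics(test, x, y))
    print("kept: ", holds_on_train_and_test(train, test, x, y, 0.3, 0.6, 1.0))
```

A lift above 1 means the antecedent makes the consequent more likely than its base rate. Requiring the thresholds to hold on the independent test set is what filters out rules that fit only the training sample.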

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Decision Support Systems, Management*
  • Diagnosis, Computer-Assisted / methods*
  • Heart Diseases / diagnosis*
  • Medical Records Systems, Computerized*
  • Pattern Recognition, Automated / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity