Using Data Mining and Association Rules for Early Diagnosis of Esophageal Cancer

Gulf J Oncolog. 2022 Sep;1(40):38-46.

Abstract

Of 17,000 new cases of esophageal cancer worldwide in the last year, 16,000 proved fatal. Late or incorrect diagnosis increases the fatality rate of esophageal cancer. Today, data-mining techniques supported by up-to-date technology can predict the course of the disease, and this knowledge can help reduce esophageal cancer mortality. This study aims to find an association between general characteristics, screening tests, and esophageal cancer, based on raw data from the Cancer Research Center collected through in-person interviews, using data mining and classification techniques on mortality. The model included the 5-year medical records of 512 patients with esophageal cancer or problems related to this cancer, each described by 50 functional characteristics. To build a prognostic and rule-discovery model for esophageal cancer, we preprocessed the data with the EM algorithm. After accurate identification of the data, WEKA software tools and the Java programming language were used to build an association rule classifier and to run the Apriori algorithm for association rule discovery. The rule miner generated 6 significant association rules for classification, with 95% and 91% confidence based on screening tests and general attributes, respectively. These rules showed a significant association between age, history of medication, smoking, gender, carcinoembryonic antigen (CEA), creatinine, WBCs, and platelets. Physicians can use these findings as a clue that patients with these characteristics are more likely to develop esophageal cancer, which may help with early diagnosis.

Keywords: data mining, esophageal cancer, association rule, healthcare.
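To illustrate what Apriori-based rule discovery with a confidence threshold involves, here is a minimal Python sketch. It is not the study's WEKA/Java implementation: the function names (`apriori`, `rules_for`), the attribute labels, the thresholds, and the eight toy records are all hypothetical and chosen only to show how support and confidence are computed for rules whose consequent is the diagnosis.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many records contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules_for(frequent, consequent, min_confidence):
    """List rules 'antecedent -> consequent' with enough confidence.

    confidence = support(antecedent and consequent) / support(antecedent)
    """
    target = frozenset([consequent])
    found = []
    for itemset, support in frequent.items():
        if consequent in itemset and len(itemset) > 1:
            antecedent = itemset - target
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                found.append((sorted(antecedent), consequent, support, confidence))
    # Highest confidence first, then alphabetical by antecedent.
    return sorted(found, key=lambda r: (-r[3], r[0]))


# Hypothetical toy records (NOT the study's data): one set of attributes per patient.
records = [
    {"age>=60", "smoker", "CEA=high", "cancer"},
    {"age>=60", "smoker", "CEA=high", "cancer"},
    {"age>=60", "smoker", "cancer"},
    {"age>=60", "CEA=normal"},
    {"smoker", "CEA=normal"},
    {"age>=60", "smoker", "CEA=high", "cancer"},
    {"CEA=normal"},
    {"age>=60", "smoker", "CEA=normal"},
]

frequent = apriori(records, min_support=0.3)
rules = rules_for(frequent, "cancer", min_confidence=0.9)

for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent} (support={support:.2f}, confidence={confidence:.2f})")
```

A rule reported at 95% confidence means that, among the records matching the antecedent, 95% also match the consequent. The support threshold keeps rules that apply to very few patients from being reported.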

MeSH terms

  • Data Mining
  • Early Detection of Cancer*
  • Esophageal Neoplasms* / diagnosis
  • Humans