Objectives: The objective of this work was to contribute to the development, validation and application of data mining methods for prediction in medical decision support systems. The particular focus was on predicting cardiovascular risk factors in hemodialysis patients, specifically the interventricular septum (IVS) thickness of individual patients' hearts, an important quantitative indicator for diagnosing left ventricular hypertrophy. The work was based on data from 63 long-term hemodialysis patients of the KfH Dialysis Centre in Jena, Germany.
Methods: The approach is based on data mining methods and involves four major steps: data-based clustering, cluster-based rule extraction, rulebase construction, and cluster- and rule-based prediction. The methods employed include crisp and fuzzy algorithms. At each step, the results were validated logically and medically. Different sets of randomly selected patient data were used to train, test and optimize the clusterbases and rulebases for prediction.
Results: Using the best clusterbase/rulebase combination designed, the IVS thickness cluster ('small' or 'large') was predicted correctly for 30 of the 35 patients with known IVS values in the training data set; no patient was predicted incorrectly and 5 received a parity prediction. For the test data set, 4 of the 6 patients with known IVS values were predicted correctly, none incorrectly, and 2 received a parity prediction. These results did not differ substantially from those obtained using the second-best clusterbase/rulebase combination, which was finally recommended for use on the basis of further performance criteria. The prediction of the IVS thickness clusters of the 22 patients with unknown IVS values also yielded good results, which were (and could only be) validated by an individual medical risk assessment of these patients.
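As a reading aid, the counts reported above can be tallied in a few lines of Python. This is only a minimal sketch for checking the arithmetic, not part of the method itself: the names `reported`, `summarize`, `UNKNOWN_IVS` and `TOTAL_PATIENTS` are hypothetical, and 'parity' is treated simply as a third outcome category without any assumption about how it arises.

```python
# Counts as reported in the Results paragraph for the best
# clusterbase/rulebase combination (patients with known IVS values).
reported = {
    "training": {"correct": 30, "incorrect": 0, "parity": 5},
    "test": {"correct": 4, "incorrect": 0, "parity": 2},
}
UNKNOWN_IVS = 22      # patients without a known IVS value
TOTAL_PATIENTS = 63   # all patients in the study


def summarize(counts):
    """Return (number of patients, share predicted correctly) for one data set."""
    n = sum(counts.values())
    return n, counts["correct"] / n


summary = {name: summarize(counts) for name, counts in reported.items()}
known_total = sum(n for n, _ in summary.values())

for name, (n, rate) in summary.items():
    correct = reported[name]["correct"]
    print(f"{name}: {correct}/{n} predicted correctly ({rate:.1%})")

print(f"known IVS: {known_total}, unknown IVS: {UNKNOWN_IVS}, "
      f"total: {known_total + UNKNOWN_IVS}")
```

The tally confirms that the three groups (35 training, 6 test, 22 with unknown IVS values) account for all 63 patients.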
Conclusions: The approach proved successful for the cluster- and rule-based prediction of a quantitative variable, such as IVS thickness, for individual patients from other variables relevant to the problem. The results demonstrate the high potential of the approach, and of the methods developed and validated, to support decision-making in hemodialysis and other fields of medicine through individual risk prediction.