Objective: IgA Nephropathy (IgAN) is a common kidney disease that may progress to renal failure, known as End Stage Kidney Disease (ESKD). One of the major difficulties in dealing with this disease is predicting, at the time of diagnosis, a patient's long-term prognosis. In fact, the progression of IgAN to ESKD depends on an intricate interrelationship between clinical and laboratory findings. Therefore, the objective of this work was to select the best data mining tool for building a model able to predict (I) whether a patient with biopsy-proven IgAN will reach ESKD and (II) whether a patient will reach ESKD before or after 5 years.
Material and methods: The largest available cohort study worldwide on IgAN was used to design and compare several data-driven models. The complete dataset comprised 1174 records collected from Italian, Norwegian, and Japanese IgAN patients over the last 30 years. The data mining tools considered in this work were artificial neural networks (ANNs), neuro fuzzy systems (NFSs), support vector machines (SVMs), and decision trees (DTs). Ten-fold cross-validation was used to obtain unbiased performance estimates for all the models.
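The 10-fold cross-validation procedure mentioned above can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the function names (`k_fold_indices`, `cross_validate`, `fit_majority`, `predict_majority`) are hypothetical, and the majority-class model is only a placeholder standing in for an ANN, NFS, SVM, or DT.

```python
import random

def k_fold_indices(n_records, k=10, seed=0):
    """Shuffle record indices and split them into k nearly equal folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]

def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and return
    the held-out accuracy of each of the k folds."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(labels)) if i not in held_out_set]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores

# Hypothetical placeholder model: always predicts the majority training class.
def fit_majority(train_features, train_labels):
    return max(set(train_labels), key=train_labels.count)

def predict_majority(model, record):
    return model
```

Because every record is held out exactly once, each model is always scored on data it was not trained on; averaging the per-fold scores gives the performance estimate used to compare model families.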
Results: An extensive model comparison based on accuracy, precision, recall, and f-measure is provided. Overall, the results indicate that ANNs can provide superior performance compared to the other models. The ANN for time-to-ESKD prediction achieves accuracy, precision, recall, and f-measure greater than 90%. The ANN for ESKD prediction achieves accuracy greater than 90%, and its precision, recall, and f-measure also exceed 90% for the class of patients not reaching ESKD, while precision, recall, and f-measure for the class of patients reaching ESKD are slightly lower. The obtained model has been implemented in a Web-based decision support system (DSS).
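For reference, the four reported measures can be computed per class as in the following sketch. It uses the standard definitions of accuracy, precision, recall, and f-measure; the function names and the toy labels are hypothetical and are not data from the study.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of records whose predicted class matches the true class."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)

def class_metrics(true_labels, predicted_labels, positive):
    """Return (precision, recall, f_measure), treating `positive` as the class of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # The f-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure

# Toy labels for illustration only (not study data).
true_toy = ["ESKD", "ESKD", "no ESKD", "no ESKD", "no ESKD", "no ESKD"]
pred_toy = ["ESKD", "no ESKD", "no ESKD", "no ESKD", "no ESKD", "ESKD"]
eskd_scores = class_metrics(true_toy, pred_toy, positive="ESKD")
no_eskd_scores = class_metrics(true_toy, pred_toy, positive="no ESKD")
```

Computing the measures separately for each class shows how a model can have high overall accuracy while scoring lower on the less frequent class, which is the pattern the ESKD-prediction results describe.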
Conclusions: The extraction of novel knowledge from clinical data and the definition of predictive models to support diagnosis, prognosis, and therapy are becoming essential tools for researchers and clinical practitioners in medicine. The proposed comparative study of several data mining models for outcome prediction in IgAN patients, using a large dataset of clinical records from three different countries, provides insight into the relative predictive ability of the considered methods when applied to this disease.
Keywords: Artificial neural network; Decision support system; Decision tree; IgA Nephropathy; Neuro fuzzy system; Support vector machine.
Copyright © 2015 Elsevier Ltd. All rights reserved.