Chronic kidney disease diagnosis using decision tree algorithms

BMC Nephrol. 2021 Aug 9;22(1):273. doi: 10.1186/s12882-021-02474-z.

Abstract

Background: Chronic Kidney Disease (CKD), a gradual decline in renal function over several months to years without major symptoms, is a life-threatening disease. It progresses through six stages of increasing severity, which are defined by the Glomerular Filtration Rate (GFR); GFR is in turn estimated from attributes such as age, sex, race, and serum creatinine. Among the available models for estimating GFR, the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation, which is a linear model, has been found to be quite efficient because it allows all CKD stages to be detected.
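To make the relationship between serum creatinine, estimated GFR, and stage concrete, the following is a minimal sketch. It assumes the 2009 CKD-EPI creatinine equation (which includes the race coefficient mentioned above) and the six KDIGO GFR categories G1, G2, G3a, G3b, G4, and G5; the abstract does not state which equation version or cut-offs the study used, and the function names are hypothetical.

```python
def ckd_epi_2009(scr_mg_dl, age, female, black=False):
    """Estimate GFR in mL/min/1.73 m^2 (assumed: 2009 CKD-EPI creatinine equation)."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine threshold (mg/dL)
    alpha = -0.329 if female else -0.411    # sex-specific exponent below the threshold
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


def gfr_category(egfr):
    """Map an eGFR value to a GFR category (assumed: KDIGO cut-offs)."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    egfr = ckd_epi_2009(scr_mg_dl=1.4, age=60, female=True)
    print(round(egfr, 1), gfr_category(egfr))
```

Staging by GFR alone is a simplification: a clinical diagnosis of CKD also depends on how long the reduction persists and on markers of kidney damage, which this sketch does not model.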

Methods: Early detection and cure of CKD is highly desirable because it can prevent unwanted consequences. Machine learning methods have recently been widely advocated for the early detection of symptoms and the diagnosis of several diseases. With the same motivation, this study aims to predict the stages of CKD by applying machine learning classification algorithms to a dataset obtained from the medical records of affected people. Specifically, we used the Random Forest and J48 algorithms to obtain a sustainable and practicable model that detects the various stages of CKD with comprehensive medical accuracy.
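J48 is Weka's implementation of the C4.5 decision tree, which chooses splits by gain ratio (information gain divided by the entropy of the split itself). The sketch below illustrates only that criterion for a single categorical attribute; it is not the study's pipeline, it omits C4.5's handling of continuous attributes, missing values, and pruning, and the attribute and stage values in the example are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Entropy of the partition itself, which penalizes many-valued attributes.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical toy records: a binned creatinine attribute and a stage label.
    creatinine_bin = ["high", "high", "low", "low"]
    stage = ["G4", "G4", "G2", "G2"]
    print(gain_ratio(creatinine_bin, stage))
```

A tree builder in this style would evaluate the criterion for each candidate attribute at a node and split on the best one; Random Forest instead aggregates many trees trained on resampled data and random feature subsets.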

Results: Comparative analysis of the results revealed that J48, with an accuracy of 85.5%, predicted CKD in all stages better than Random Forest.

Conclusions: The study concluded that the resulting model may be used to build an automated system for detecting the severity of CKD.

Keywords: CKD; Decision tree; GFR; J48; Machine learning; Random Forest.

MeSH terms

  • Algorithms
  • Decision Trees*
  • Disease Progression*
  • Early Diagnosis
  • Female
  • Glomerular Filtration Rate*
  • Humans
  • Kidney Function Tests / methods
  • Machine Learning*
  • Male
  • Medical Records / statistics & numerical data
  • Middle Aged
  • Patient Acuity
  • Prognosis
  • Renal Insufficiency, Chronic* / diagnosis
  • Renal Insufficiency, Chronic* / physiopathology
  • Reproducibility of Results
  • Severity of Illness Index