Accurate and rapid screening model for potential diabetes mellitus

BMC Med Inform Decis Mak. 2019 Mar 12;19(1):41. doi: 10.1186/s12911-019-0790-3.

Abstract

Background: Predicting or diagnosing diabetes early is crucial for populations at high risk of the disease.

Methods: In this study, we assessed the ability of five popular classifiers (J48, AdaBoostM1, SMO, Bayes Net, and Naïve Bayes) to identify individuals with diabetes based on nine non-invasive and easily obtained clinical features: age, gender, body mass index (BMI), hypertension, history of cardiovascular disease or stroke, family history of diabetes, physical activity, work stress, and salty food preference. A total of 4205 data entries were obtained from annual physical examination reports for adults at the Shengjing Hospital of China Medical University from January to April 2017. Weka data mining software was used to identify the best algorithm for diabetes classification.
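J48 is Weka's implementation of the C4.5 decision tree learner, which ranks candidate splits by gain ratio. The following is a minimal sketch of that criterion for categorical features only, not the study's code: the function names and the toy feature and label lists are hypothetical, and the real J48 additionally handles numeric thresholds, missing values, and pruning.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a categorical feature divided by its split information."""
    n = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels that remains after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 1 = diabetes, 0 = no diabetes.
labels = [1, 1, 0, 0]
family_history = [1, 1, 0, 0]   # separates the classes perfectly
gender = [1, 0, 1, 0]           # carries no information about the label

perfect = gain_ratio(family_history, labels)
uninformative = gain_ratio(gender, labels)
```

In this toy case the feature with the higher gain ratio would be chosen for the split, which is the same mechanism by which a feature such as age ends up near the root of a tree.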

Results: The J48 decision tree classifier performed best (accuracy = 0.9503, precision = 0.950, recall = 0.950, F-measure = 0.948, and AUC = 0.964). The decision tree structure shows that age is the most significant feature, followed by family history of diabetes, work stress, BMI, salty food preference, physical activity, hypertension, gender, and history of cardiovascular disease or stroke.
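For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, precision, recall, F-measure, and AUC are conventionally computed for a binary classifier. The function names, confusion-matrix counts, and scores are made up for illustration and are not the study's data. The values here are for the positive class only, whereas Weka's summary output also reports class-weighted averages, so these formulas may not reproduce the abstract's figures exactly.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical confusion-matrix counts for a screening classifier.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=130)

# Hypothetical predicted probabilities for four individuals.
auc = auc_from_scores(labels=[1, 1, 0, 0], scores=[0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC computation is quadratic in the number of samples, which is fine for a small illustration; a rank-based formula would be preferable for a dataset of several thousand records.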

Conclusions: Our study shows that decision tree analyses can be applied to screen individuals for early diabetes risk without the need for invasive tests. This procedure will be particularly useful in developing regions with high epidemiological risk and poor socioeconomic status, and will enable clinical practitioners to rapidly screen patients for increased risk of diabetes. The key features in the tree structure could further facilitate diabetes prevention through targeted community interventions, which can potentially improve early diabetes diagnosis and reduce burdens on the healthcare system.

Keywords: Data mining; Diabetes; Screening.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • China
  • Clinical Decision-Making*
  • Data Mining*
  • Decision Support Techniques*
  • Decision Trees*
  • Diabetes Mellitus / diagnosis*
  • Early Diagnosis*
  • Female
  • Humans
  • Male
  • Middle Aged