Background: The aim of this study was to develop a predictive model to classify people with type 2 diabetes (T2D) into expected levels of success upon bolus insulin initiation.
Methods: Machine learning methods were applied to a large, nationally representative insurance claims database from the United States (dNHI database; data from 2007 to 2017). We trained boosted decision tree ensembles (XGBoost) to assign people to Class 0 (never meeting HbA1c goal), Class 1 (meeting but not maintaining HbA1c goal), or Class 2 (meeting and maintaining HbA1c goal) based on the demographic and clinical data available before bolus insulin initiation. The primary objective of the study was to develop a model capable of determining, at an individual level, whether people with T2D are likely to achieve and maintain HbA1c goals. The HbA1c goal was defined as HbA1c <8.0% or a reduction of baseline HbA1c by >1.0%.
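The three outcome classes defined above can be sketched in a few lines of Python. This is a minimal illustration, not the study's labeling code. The names `meets_goal` and `assign_class` are hypothetical. Two points the abstract does not specify are assumed here: that a reduction of >1.0% means an absolute drop of more than 1.0 percentage points, and that "maintaining" means every follow-up measurement from the first goal-meeting one onward also meets the goal.

```python
from typing import Sequence

GOAL_THRESHOLD = 8.0  # HbA1c (%) below which the goal counts as met
GOAL_REDUCTION = 1.0  # assumed: absolute drop from baseline, in percentage points


def meets_goal(baseline: float, value: float) -> bool:
    """True if one follow-up HbA1c value meets the goal as defined in Methods."""
    return value < GOAL_THRESHOLD or (baseline - value) > GOAL_REDUCTION


def assign_class(baseline: float, follow_up: Sequence[float]) -> int:
    """Map a chronologically ordered HbA1c series to Class 0, 1, or 2.

    Assumption: 'maintained' means all measurements from the first
    goal-meeting one onward also meet the goal.
    """
    met = [meets_goal(baseline, v) for v in follow_up]
    if not any(met):
        return 0  # never met the goal
    first = met.index(True)
    return 2 if all(met[first:]) else 1  # maintained vs. not maintained


# Made-up example values, for illustration only.
example_classes = [
    assign_class(9.5, [9.2, 9.0]),  # stays above goal
    assign_class(9.5, [7.5, 8.6]),  # meets the goal, then loses it
    assign_class(9.5, [7.8, 7.2]),  # meets and keeps the goal
]
```

Under these assumptions, a person with no follow-up measurements falls into Class 0; a real cohort definition would more likely exclude such records.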
Results: Of 15 331 people with T2D (mean age, 53.0 years; SD, 8.7), 7800 (50.9%) met the HbA1c goal but failed to maintain it (Class 1), 4510 (29.4%) never attained the goal (Class 0), and 3021 (19.7%) met and maintained the goal (Class 2). Overall, the model's area under the receiver operating characteristic curve (ROC) was 0.79, with better performance in predicting Class 2 (ROC = 0.92) than Classes 0 and 1 (ROC = 0.71 and 0.62, respectively). The areas under the precision-recall curves for the individual classes were 0.46 (Class 0), 0.58 (Class 1), and 0.71 (Class 2).
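Per-class figures of this kind are typically computed one-vs-rest: each class in turn is treated as the positive label and scored against the other two. The sketch below shows that calculation in plain Python on made-up toy data. It is an illustration under stated assumptions, not the study's evaluation code: the function names are hypothetical, the area under the precision-recall curve is approximated by average precision, and the abstract does not say how the overall value of 0.79 was averaged, so only per-class values are computed.

```python
from typing import List, Sequence


def ovr_roc_auc(labels: Sequence[int], probs: Sequence[Sequence[float]], positive: int) -> float:
    """One-vs-rest ROC AUC: the probability that a randomly chosen member of
    the positive class gets a higher score than a randomly chosen non-member
    (ties count as one half)."""
    pos = [p[positive] for y, p in zip(labels, probs) if y == positive]
    neg = [p[positive] for y, p in zip(labels, probs) if y != positive]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def ovr_average_precision(labels: Sequence[int], probs: Sequence[Sequence[float]], positive: int) -> float:
    """One-vs-rest average precision: the mean of the precision values at the
    rank of each true positive. Assumes no tied scores among the ranked items."""
    ranked = sorted(zip(probs, labels), key=lambda pair: pair[0][positive], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == positive:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data, invented for illustration: true classes and predicted
# probabilities for Class 0, Class 1, and Class 2 (each row sums to 1).
labels: List[int] = [0, 0, 1, 1, 2, 2]
probs: List[List[float]] = [
    [0.6, 0.3, 0.1],
    [0.4, 0.4, 0.2],
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
]

per_class_auc = {c: ovr_roc_auc(labels, probs, c) for c in (0, 1, 2)}
```

Because average precision depends on class prevalence, its chance-level baseline for each class is roughly that class's share of the cohort, which is worth keeping in mind when comparing precision-recall values across classes of different sizes.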
Conclusions: Predictive modeling using routine health care data classified, with reasonable accuracy, patients initiating bolus insulin who would achieve and maintain HbA1c goals, but was less accurate in differentiating patients who never met the goal from those who met but did not maintain it. Prior HbA1c was a major contributing parameter for the predictions.
Keywords: HbA1c; insulin; type 2 diabetes.