Machine Learning Algorithms to Predict Colistin-Induced Nephrotoxicity from Electronic Health Records in Patients with Multidrug-Resistant Gram-Negative Infection

Int J Antimicrob Agents. 2024 Apr 18:107175. doi: 10.1016/j.ijantimicag.2024.107175. Online ahead of print.

Abstract

Objectives: Colistin-induced nephrotoxicity prolongs hospitalization and increases mortality. The study aimed to construct machine learning models to predict colistin-induced nephrotoxicity in patients with multidrug-resistant gram-negative infection.

Methods: Patients receiving colistin from three hospitals in the Clinical Research Database were included. Data were divided into a derivation cohort (2011 to 2017) and a temporal validation cohort (2018 to 2020). Fifteen machine learning models were established using categorical boosting, light gradient boosting machine, and random forest. Classifier performance was compared using sensitivity, F1 score, Matthews correlation coefficient (MCC), area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC). SHapley Additive exPlanations plots were drawn to understand feature importance and interactions.
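As an illustration of the comparison metrics named in the Methods, the following is a minimal pure-Python sketch of sensitivity, F1 score, MCC, and AUROC computed from their standard definitions. It is not the authors' code: the function names and the toy labels in the usage note are hypothetical, and AUPRC is omitted for brevity.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = nephrotoxicity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    """Fraction of true cases that the classifier flags."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in count form."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, with hypothetical labels `[1, 1, 1, 0, 0, 0, 1, 0]` and predictions `[1, 1, 0, 0, 0, 1, 1, 0]`, `confusion_counts` returns `(3, 3, 1, 1)`, giving a sensitivity of 0.75, an F1 score of 0.75, and an MCC of 0.5. MCC uses all four cells of the confusion matrix, which makes it informative alongside F1 when classes are imbalanced.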

Results: The study included 1392 patients, with 360 (36.4%) and 165 (40.9%) experiencing nephrotoxicity in the derivation and temporal validation cohorts, respectively. Categorical boosting with oversampling achieved the highest performance, with a sensitivity of 0.860, an F1 score of 0.740, an MCC of 0.533, an AUROC of 0.823, and an AUPRC of 0.737. Feature importance analysis showed that the days of colistin use, cumulative dose, daily dose, latest C-reactive protein, and baseline hemoglobin were the most important risk factors, especially for vulnerable patients. A cutoff colistin dose of 4.0 mg/kg body weight/day was identified for patients at higher risk of nephrotoxicity.
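The cohort sizes are not stated directly in the Results, but they can be back-calculated from the reported event counts and percentages. The short sketch below does that arithmetic as a consistency check; the helper name is hypothetical, and the sizes it yields are implied values that assume the percentages were rounded to one decimal place.

```python
def implied_cohort_size(events, percent):
    """Back-calculate a cohort size from an event count and its reported percentage."""
    return round(events * 100 / percent)

# Event counts and percentages as reported in the Results.
derivation_n = implied_cohort_size(360, 36.4)   # implied derivation cohort size
validation_n = implied_cohort_size(165, 40.9)   # implied temporal validation cohort size
total_n = derivation_n + validation_n

print(derivation_n, validation_n, total_n)
```

The implied sizes are 989 and 403, which sum to the 1392 patients reported for the whole study.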

Conclusions: Machine learning techniques can serve as an early identification tool for colistin-induced nephrotoxicity. The observed interactions suggest that dose adjustment guidelines may need modification. Future geographic and prospective validation studies are warranted to strengthen real-world applicability.

Keywords: CatBoost; colistin; machine learning; nephrotoxicity; resampling; support vector machine.