The Development of a Machine Learning Inpatient Acute Kidney Injury Prediction Model

Crit Care Med. 2018 Jul;46(7):1070-1077. doi: 10.1097/CCM.0000000000003123.


Objectives: To develop an acute kidney injury risk prediction model using electronic health record data for longitudinal use in hospitalized patients.

Design: Observational cohort study.

Setting: Tertiary, urban, academic medical center from November 2008 to January 2016.

Patients: All adult inpatients without pre-existing renal failure at admission. Pre-existing renal failure was defined as a first serum creatinine greater than or equal to 3.0 mg/dL, an International Classification of Diseases, 9th Edition, code for chronic kidney disease stage 4 or higher, or receipt of renal replacement therapy within 48 hours of the first serum creatinine measurement.
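
The exclusion rule above can be sketched as a simple predicate. This is a minimal illustration, not the authors' code: the function name, the argument names, the integer encoding of chronic kidney disease stage, and the reading of "within 48 hours" as either side of the first measurement are all assumptions.

```python
from typing import Optional


def has_preexisting_renal_failure(
    first_scr_mg_dl: float,
    ckd_stage: Optional[int],
    hours_from_first_scr_to_rrt: Optional[float],
) -> bool:
    """Return True if a patient meets any of the three exclusion criteria.

    All parameter names are hypothetical. ckd_stage is None when no chronic
    kidney disease code is recorded; hours_from_first_scr_to_rrt is None when
    the patient never received renal replacement therapy.
    """
    # Criterion 1: first serum creatinine >= 3.0 mg/dL.
    if first_scr_mg_dl >= 3.0:
        return True
    # Criterion 2: coded chronic kidney disease stage 4 or higher.
    if ckd_stage is not None and ckd_stage >= 4:
        return True
    # Criterion 3: renal replacement therapy within 48 hours of the first
    # serum creatinine (assumption: before or after the measurement).
    if (
        hours_from_first_scr_to_rrt is not None
        and abs(hours_from_first_scr_to_rrt) <= 48
    ):
        return True
    return False
```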

Interventions: None.

Measurements and main results: Demographics, vital signs, diagnostics, and interventions were used in a Gradient Boosting Machine algorithm to predict serum creatinine-based Kidney Disease Improving Global Outcomes stage 2 acute kidney injury, with 60% of the data used for derivation and 40% for validation. Area under the receiver operating characteristic curve (AUC) was calculated in the validation cohort, and subgroup analyses were conducted across admission serum creatinine, acute kidney injury severity, and hospital location. Among the 121,158 included patients, 17,482 (14.4%) developed any Kidney Disease Improving Global Outcomes acute kidney injury, with 4,251 (3.5%) developing stage 2. The AUC (95% CI) was 0.90 (0.90-0.90) for predicting stage 2 acute kidney injury within 24 hours and 0.87 (0.87-0.87) within 48 hours. The AUC was 0.96 (0.96-0.96) for receipt of renal replacement therapy (n = 821) in the next 48 hours. Accuracy was similar across hospital settings (ICU, wards, and emergency department) and admitting serum creatinine groupings. At a probability threshold of greater than or equal to 0.022, the algorithm had a sensitivity of 84% and a specificity of 85% for stage 2 acute kidney injury and identified patients a median of 41 hours (interquartile range, 12-141 hr) before they developed stage 2 acute kidney injury.
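
To make the reported metrics concrete, the sketch below shows how sensitivity and specificity at a fixed probability threshold, and the AUC, are computed from predicted probabilities and observed outcomes. It is an illustration only, not the study's analysis code: the function names are hypothetical, and the toy probabilities and labels are invented for the example and are not study data. Only the 0.022 threshold is taken from the abstract.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Classify as positive when prob >= threshold, then score against labels."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_positive = p >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(probs: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive case outranks a random
    negative case, counting ties as one half (pairwise comparison)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = developed stage 2 AKI).
toy_probs = [0.01, 0.02, 0.03, 0.05, 0.10, 0.40]
toy_labels = [0, 0, 0, 1, 0, 1]

sens, spec = sensitivity_specificity(toy_probs, toy_labels, threshold=0.022)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} "
      f"AUC={auc(toy_probs, toy_labels):.3f}")
```

The pairwise AUC is quadratic in the number of cases, which is fine for a toy example; a cohort of the size described here would normally use a rank-based implementation instead.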

Conclusions: Readily available electronic health record data can be used to predict impending acute kidney injury prior to changes in serum creatinine with excellent accuracy across different patient locations and admission serum creatinine levels. Real-time use of this model would allow early interventions for those at high risk of acute kidney injury.

Publication types

  • Observational Study

MeSH terms

  • Acute Kidney Injury / diagnosis
  • Acute Kidney Injury / etiology*
  • Algorithms
  • Area Under Curve
  • Creatinine / blood
  • Electronic Health Records
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Models, Statistical
  • ROC Curve
  • Renal Replacement Therapy / statistics & numerical data
  • Reproducibility of Results


Substances

  • Creatinine