Bias in Prediction Models to Identify Patients With Colorectal Cancer at High Risk for Readmission After Resection

JCO Clin Cancer Inform. 2024 Nov:8:e2300194. doi: 10.1200/CCI.23.00194. Epub 2024 Oct 9.

Abstract

Purpose: Machine learning algorithms are used for predictive modeling in medicine, but studies often do not evaluate or report on the potential biases of the models. Our purpose was to develop clinical prediction models for readmission after surgery in colorectal cancer (CRC) patients and to examine their potential for racial bias.

Methods: We used the 2012-2020 American College of Surgeons' National Surgical Quality Improvement Program (ACS-NSQIP) Participant Use File and Targeted Colectomy File. Patients were categorized into four race groups: White, Black or African American, Other, and Unknown/Not Reported. Potential predictive features were identified from studies of risk factors for 30-day readmission in CRC patients. We compared four machine learning-based methods: logistic regression (LR), multilayer perceptron (MLP), random forest (RF), and XGBoost (XGB). Model bias was assessed using false negative rate (FNR) difference, false positive rate (FPR) difference, and disparate impact.
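To illustrate the FNR and FPR difference metrics named above, the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the choice of White as the reference group, and the toy labels are all assumptions made for illustration, and the abstract does not state which reference group or threshold the study used.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: (FNR, FPR)} for binary labels and binary predictions.

    FNR = FN / (FN + TP), FPR = FP / (FP + TN), computed within each group.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            key = "tp" if p == 1 else "fn"
        else:
            key = "fp" if p == 1 else "tn"
        counts[g][key] += 1

    rates = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        fnr = c["fn"] / positives if positives else float("nan")
        fpr = c["fp"] / negatives if negatives else float("nan")
        rates[g] = (fnr, fpr)
    return rates


def rate_differences(rates, reference):
    """Return {group: (FNR difference, FPR difference)} relative to `reference`."""
    ref_fnr, ref_fpr = rates[reference]
    return {
        g: (fnr - ref_fnr, fpr - ref_fpr)
        for g, (fnr, fpr) in rates.items()
        if g != reference
    }


# Toy data (hypothetical, not from ACS-NSQIP): 1 = readmitted / predicted high risk.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]
groups = ["White"] * 4 + ["Other"] * 4

rates = group_error_rates(y_true, y_pred, groups)
# Assumption: White is used as the reference group.
diffs = rate_differences(rates, reference="White")
print(rates)
print(diffs)
```

A positive FNR difference means the model misses more true readmissions in that group than in the reference group; a negative FPR difference means it raises fewer false alarms for that group.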

Results: In all, 112,077 patients were included, 67.2% of whom were White, 9.2% Black, 5.6% Other race, and 18% with race not recorded. There were significant differences in the area under the receiver operating characteristic curve (AUROC), FPR, and FNR between race groups across all models. Notably, patients in the 'Other' race category had a higher FNR than Black patients in all but the XGB model, while Black patients had a higher FPR than White patients in some models. Patients in the 'Other' category consistently had the lowest FPR. Applying the 80% rule for disparate impact, the models consistently met the threshold for unfairness for the 'Other' race category.
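The 80% rule mentioned above can be sketched as follows. This is an illustrative example under stated assumptions, not the study's implementation: disparate impact is taken here to be the ratio of a group's positive-prediction rate to that of a reference group (assumed to be White), a group is flagged when that ratio falls below 0.8, and the function names and toy predictions are hypothetical.

```python
def positive_rate(preds):
    """Fraction of patients predicted positive (flagged as high risk)."""
    return sum(preds) / len(preds)


def disparate_impact(y_pred, groups, reference):
    """Return {group: positive-rate ratio versus the reference group}."""
    by_group = {}
    for p, g in zip(y_pred, groups):
        by_group.setdefault(g, []).append(p)
    ref_rate = positive_rate(by_group[reference])
    return {
        g: positive_rate(preds) / ref_rate
        for g, preds in by_group.items()
        if g != reference
    }


def violates_80_percent_rule(ratio, threshold=0.8):
    """Flag a group whose ratio falls below the threshold (assumed one-sided)."""
    return ratio < threshold


# Toy predictions (hypothetical): 1 = predicted high risk of readmission.
y_pred = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
groups = ["White"] * 5 + ["Other"] * 5

ratios = disparate_impact(y_pred, groups, reference="White")
flags = {g: violates_80_percent_rule(r) for g, r in ratios.items()}
print(ratios)
print(flags)
```

In this toy example the 'Other' group is flagged as high risk half as often as the reference group, so its ratio falls below 0.8. Some analyses also flag ratios above 1.25; that two-sided variant is not shown here.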

Conclusion: Predictive models for 30-day readmission after colorectal surgery may perform unequally for different race groups, potentially propagating inequalities in the delivery of care and in patient outcomes if the predictions from these models are used to direct care.

MeSH terms

  • Aged
  • Bias
  • Colectomy / methods
  • Colorectal Neoplasms* / diagnosis
  • Colorectal Neoplasms* / surgery
  • Female
  • Humans
  • Logistic Models
  • Machine Learning*
  • Male
  • Middle Aged
  • Patient Readmission* / statistics & numerical data
  • Risk Assessment / methods
  • Risk Factors