Towards proactive palliative care in oncology: developing an explainable EHR-based machine learning model for mortality risk prediction

Qingyuan Zhuang; Alwin Yaoxian Zhang; Ryan Shea Tan Ying Cong; Grace Meijuan Yang; Patricia Soek Hui Neo; Daniel Sw Tan; Melvin Lk Chua; Iain Beehuat Tan; Fuh Yong Wong; Marcus Eng Hock Ong; Sean Shao Wei Lam; Nan Liu

doi:10.1186/s12904-024-01457-9

BMC Palliat Care. 2024 May 20;23(1):124. doi: 10.1186/s12904-024-01457-9.

Authors

Qingyuan Zhuang¹,², Alwin Yaoxian Zhang³, Ryan Shea Tan Ying Cong⁴,⁵, Grace Meijuan Yang³,⁶, Patricia Soek Hui Neo³, Daniel Sw Tan⁴,⁷, Melvin Lk Chua⁵,⁸, Iain Beehuat Tan⁴,⁵,⁹, Fuh Yong Wong⁸,¹⁰, Marcus Eng Hock Ong¹¹,¹², Sean Shao Wei Lam¹¹,¹², Nan Liu¹³,¹²

Affiliations

¹ Division of Supportive and Palliative Care, National Cancer Centre Singapore, 30 Hospital Blvd, Singapore, 168583, Singapore. zhuang.qingyuan@singhealth.com.sg.
² Data Computational Science Core, National Cancer Centre Singapore, Singapore, Singapore. zhuang.qingyuan@singhealth.com.sg.
³ Division of Supportive and Palliative Care, National Cancer Centre Singapore, 30 Hospital Blvd, Singapore, 168583, Singapore.
⁴ Division of Medical Oncology, National Cancer Centre Singapore, Singapore, Singapore.
⁵ Data Computational Science Core, National Cancer Centre Singapore, Singapore, Singapore.
⁶ Lien Centre of Palliative Care, Duke-NUS Medical School, Singapore, Singapore.
⁷ Division of Clinical Trials and Epidemiological Sciences, National Cancer Centre Singapore, Singapore, Singapore.
⁸ Division of Radiation Oncology, National Cancer Centre Singapore, Singapore, Singapore.
⁹ Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore.
¹⁰ Department of Cancer Informatics, National Cancer Centre Singapore, Singapore, Singapore.
¹¹ Health Services Research Centre, SingHealth, Singapore.
¹² Program in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore.
¹³ Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore.

Abstract

Background: Identifying the last year of life in advance facilitates a proactive palliative approach. Machine learning models trained on electronic health records (EHR) demonstrate promising performance in cancer prognostication. However, gaps in the literature include incomplete reporting of model performance, inadequate alignment of model formulation with the implementation use-case, and insufficient explainability, which hinders trust and adoption in clinical settings. We therefore aim to develop an explainable, EHR-based machine learning model that prompts palliative care processes by predicting 365-day mortality risk among patients with advanced cancer in an outpatient setting.

Methods: Our cohort consisted of 5,926 adults diagnosed with Stage 3 or 4 solid organ cancer between July 1, 2017, and June 30, 2020, who were receiving ambulatory cancer care within a tertiary center. The classification problem was modelled using Extreme Gradient Boosting (XGBoost) and aligned to our envisioned use-case: "Given a prediction point that corresponds to an outpatient cancer encounter, predict mortality within 365 days from the prediction point, using EHR data from up to 365 days prior." Clinical characteristics, laboratory tests and treatment data were used to train the model. The model was trained on 75% of the dataset (n = 39,416 outpatient encounters) and validated on a 25% hold-out dataset (n = 13,122 outpatient encounters). To explain model outputs, we used Shapley Additive Explanations (SHAP) values. Discrimination was evaluated using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC), while model calibration was assessed using the Brier score.
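The use-case formulation above can be illustrated with a minimal Python sketch of how a single prediction point might be labelled and how its look-back window might be applied. This is not the authors' code: the function names, the inclusive window boundaries, and the treatment of patients with no recorded death as negatives (that is, ignoring censoring from incomplete follow-up) are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

# 365 days is used for both the look-back and the prediction window,
# following the use-case statement; inclusive boundaries are an assumption.
WINDOW = timedelta(days=365)


def label_prediction_point(encounter: date, death: Optional[date]) -> int:
    """Return 1 if death occurs within 365 days after the encounter, else 0.

    Assumption: a missing death date is treated as survival (label 0).
    """
    if death is None:
        return 0
    return int(encounter <= death <= encounter + WINDOW)


def lookback_records(
    encounter: date, records: List[Tuple[date, str]]
) -> List[Tuple[date, str]]:
    """Keep only records dated within the 365 days up to the encounter."""
    return [(d, v) for d, v in records if encounter - WINDOW <= d <= encounter]


# Hypothetical example: one encounter and placeholder EHR records.
encounter = date(2019, 3, 1)
records = [
    (date(2018, 1, 15), "record_a"),  # older than 365 days, dropped
    (date(2018, 9, 1), "record_b"),
    (date(2019, 2, 20), "record_c"),
]
usable = lookback_records(encounter, records)
label = label_prediction_point(encounter, date(2019, 12, 1))
```

Because each outpatient encounter is its own prediction point, one patient can contribute many labelled rows, which is why the encounter counts exceed the number of patients in the cohort.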

Results: In total, 17,149 of the 52,538 prediction points (32.6%) had a mortality event within the 365-day prediction window. The model demonstrated an AUROC of 0.861 (95% CI 0.856-0.867) and an AUPRC of 0.771. The Brier score was 0.147, and the model slightly overestimated mortality risk. Explanatory diagrams utilizing SHAP values allowed visualization of feature impacts on predictions at both the global and individual levels.
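For readers less familiar with the three reported metrics, the following pure-Python sketch shows one common way each can be computed, and checks the event rate from the counts given above. It is an illustration only: the abstract does not state which implementation the authors used, the step-wise average-precision form of AUPRC is an assumption, and the toy labels and probabilities are invented solely to exercise the functions.

```python
def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve.

    Assumption: scores are not tied; precision is averaged at each positive.
    """
    ranked = sorted(zip(y_score, y_true), reverse=True)
    true_pos, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank
    return total / sum(y_true)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Event rate from the reported counts; a model with no skill would be
# expected to score an AUPRC close to this value.
event_rate = 17149 / 52538

# Invented toy data, only to show the functions running.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_probs = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
toy_auroc = auroc(toy_labels, toy_probs)
toy_ap = average_precision(toy_labels, toy_probs)
toy_brier = brier_score(toy_labels, toy_probs)
```

Comparing the reported AUPRC of 0.771 with the event rate of roughly 0.326 gives a sense of how far the model improves on a no-skill baseline. The Brier score by itself summarizes overall probabilistic accuracy and does not show the direction of miscalibration, which is usually read from a calibration plot.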

Conclusion: Our machine learning model demonstrated good discrimination and precision-recall performance in predicting 365-day mortality risk among individuals with advanced cancer. It has the potential to provide personalized mortality predictions and facilitate earlier integration of palliative care.

Keywords: Clinical decision support systems; Electronic Health Records; Machine learning; Oncology; Palliative Medicine.

MeSH terms

Adult
Aged
Aged, 80 and over
Cohort Studies
Electronic Health Records* / statistics & numerical data
Female
Humans
Machine Learning* / standards
Male
Medical Oncology / methods
Medical Oncology / standards
Middle Aged
Mortality / trends
Neoplasms / mortality
Neoplasms / therapy
Palliative Care* / methods
Palliative Care* / standards
Palliative Care* / statistics & numerical data
Risk Assessment / methods
