Estimation of Surgery Durations Using Machine Learning Methods: A Cross-Country Multi-Site Collaborative Study

Healthcare (Basel). 2022 Jun 25;10(7):1191. doi: 10.3390/healthcare10071191.

Abstract

The scheduling of operating room (OR) slots requires accurate prediction of surgery duration. We compared the performance of existing moving average (MA)-based estimates with that of novel machine learning (ML)-based models of surgery duration across two sites in the US and Singapore. We used the Duke Protected Analytics Computing Environment (PACE) to facilitate data sharing and big data analytics across the US and Singapore. Data from all colorectal surgery patients between 1 January 2012 and 31 December 2017 in Singapore and between 1 January 2015 and 31 December 2019 in the US were used; 7585 and 3597 single- and multiple-procedure cases from Singapore and the US, respectively, were included. The ML models were categorical gradient boosting (CatBoost) models trained on the data fields common to both institutions. The procedure codes were based on the Table of Surgical Procedure (TOSP) in Singapore and the Current Procedural Terminology (CPT) in the US. Surgical experts mapped the two types of codes. The CPT codes were then transformed into relative value units (RVUs). The ML models outperformed the baseline MA models. The MA, scheduled durations and procedure codes were found to have higher loadings than surgeon factors. We further demonstrated the use of the Duke PACE in facilitating data sharing and big data analytics.

Keywords: big data analytics; data sharing; machine learning; multi-country; multi-site; surgical durations.
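
As an illustration of the kind of moving-average baseline the abstract refers to, the following is a minimal sketch and not the authors' implementation. The grouping key (surgeon and procedure code), the window size, the toy durations, and the use of mean absolute error are all assumptions made for the example; the abstract does not specify them.

```python
from collections import defaultdict, deque
from statistics import mean


def moving_average_estimates(cases, window=5):
    """Estimate each case's duration from prior cases in the same group.

    `cases` is a chronologically ordered list of (group_key, duration) pairs.
    The estimate is the mean of up to `window` previous durations in that
    group, or None when the group has no history yet.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    estimates = []
    for group, duration in cases:
        past = history[group]
        estimates.append(mean(past) if past else None)
        past.append(duration)  # the actual duration becomes history afterwards
    return estimates


def mean_absolute_error(actual, predicted):
    """MAE over the cases that have an estimate (None entries are skipped)."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    return mean(abs(a - p) for a, p in pairs)


# Hypothetical toy data: ((surgeon, procedure code), duration in minutes).
cases = [
    (("surgeon_A", "proc_1"), 120),
    (("surgeon_A", "proc_1"), 140),
    (("surgeon_B", "proc_1"), 200),
    (("surgeon_A", "proc_1"), 130),
]

estimates = moving_average_estimates(cases, window=2)
baseline_mae = mean_absolute_error([d for _, d in cases], estimates)
print(estimates, baseline_mae)
```

An ML model such as a CatBoost regressor could then be scored on the same held-out cases with the same error function, which gives a like-for-like comparison with the baseline.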