Machine learning improves prediction of pulmonary thromboembolism and reduces unnecessary computed tomography scans in the emergency department

Sci Rep. 2026 Jan 9;16(1):4935. doi: 10.1038/s41598-025-34952-x.

Abstract

The diagnosis of pulmonary thromboembolism (PTE) remains challenging because its clinical signs and symptoms are nonspecific. This study aimed to develop a machine learning (ML) model to predict PTE in emergency department patients. We retrospectively analyzed 2,525 emergency department patients with suspected PTE who underwent computed tomography pulmonary angiography (CTPA) within 7 days of an elevated D-dimer level (≥ 0.5 µg/ml) at a tertiary hospital between January 2012 and December 2021. Clinical and laboratory data were split into training (n = 2,025) and test (n = 500) sets. Six ML models (XGBoost, random forest, logistic regression, elastic net regression, support vector machine, and feed-forward neural network) were compared with the revised Geneva score using the area under the receiver operating characteristic curve (AUC). Variable importance was assessed using permutation methods. Of the 2,525 patients, 573 (22.7%) were diagnosed with PTE. XGBoost achieved the highest AUC, 0.814 (95% confidence interval [CI]: 0.759-0.862). All ML models outperformed the revised Geneva score, which had an AUC of 0.622 (95% CI: 0.563-0.675). D-dimer and activated partial thromboplastin time were the most important predictors across all ML models. At sensitivities of 100%, 95%, and 90%, the XGBoost model could reduce the number of CTPA scans by 3.0%, 14.8%, and 33.2%, respectively (all p < 0.001). These findings suggest that ML models, particularly XGBoost, can improve PTE risk prediction compared with the revised Geneva score and may help reduce unnecessary CTPA imaging in the emergency department.
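The two evaluation quantities reported above, the AUC and the share of CTPA scans avoided at a fixed sensitivity, can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc` and `scans_avoided` are hypothetical, the toy scores and labels are invented for illustration, and the thresholding rule (patients scoring below the cutoff skip CTPA) is an assumption about how such a reduction might be computed.

```python
import math
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random PTE-positive case scores
    higher than a random PTE-negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def scans_avoided(
    scores: Sequence[float], labels: Sequence[int], target_sensitivity: float
) -> Tuple[float, float, float]:
    """Pick the highest cutoff that still reaches the target sensitivity.

    Assumed rule: patients scoring below the cutoff would not be scanned.
    Returns (cutoff, achieved sensitivity, fraction of scans avoided).
    """
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos:
        raise ValueError("need at least one positive case")
    # Number of positives that must score at or above the cutoff.
    k = max(1, math.ceil(target_sensitivity * len(pos) - 1e-12))
    cutoff = pos[k - 1]
    detected = sum(1 for s in pos if s >= cutoff)
    avoided = sum(1 for s in scores if s < cutoff)
    return cutoff, detected / len(pos), avoided / len(scores)


# Invented toy data (not from the study): predicted risk and true PTE status.
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    print("AUC:", round(auc(toy_scores, toy_labels), 3))
    for target in (1.0, 0.75):
        cutoff, sens, frac = scans_avoided(toy_scores, toy_labels, target)
        print(f"target {target:.2f}: cutoff={cutoff}, "
              f"sensitivity={sens:.2f}, scans avoided={frac:.0%}")
```

The sketch shows the trade-off in the abstract: lowering the required sensitivity raises the cutoff, so more low-risk patients fall below it and fewer scans are needed, at the cost of missed cases.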

Keywords: Clinical prediction rules; Computed tomography pulmonary angiography; Pulmonary embolism; Supervised machine learning.

MeSH terms

  • Adult
  • Aged
  • Computed Tomography Angiography
  • Emergency Service, Hospital*
  • Female
  • Fibrin Fibrinogen Degradation Products / analysis
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Pulmonary Embolism* / diagnosis
  • Pulmonary Embolism* / diagnostic imaging
  • ROC Curve
  • Retrospective Studies
  • Tomography, X-Ray Computed*
  • Unnecessary Procedures*

Substances

  • Fibrin Fibrinogen Degradation Products
  • fibrin fragment D