HGSORF: Henry Gas Solubility Optimization-based Random Forest for C-Section prediction and XAI-based cause analysis

Comput Biol Med. 2022 Aug:147:105671. doi: 10.1016/j.compbiomed.2022.105671. Epub 2022 May 30.


A stable predictive model is essential for forecasting the chances of cesarean or C-section (CS) delivery, as unnecessary CS delivery can adversely affect neonatal, maternal, and pediatric morbidity and mortality, and can incur significant financial burdens. Few state-of-the-art machine learning models have been applied in this area in recent years, and the current models are insufficient to correctly predict the probability of CS delivery. To address this drawback, we propose a Henry gas solubility optimization (HGSO)-based random forest (RF) with an improved objective function, called HGSORF, for the classification of CS and non-CS classes. Real-world CS datasets, such as the Pakistan Demographic and Health Survey (PDHS) dataset used in this study, can be noisy. HGSO can provide fine-tuned RF hyperparameters by avoiding local minima. For performance comparison, Gaussian Naive Bayes (GNB), linear discriminant analysis (LDA), K-nearest neighbors (KNN), gradient boosting classifier (GBC), and logistic regression (LR) were considered. The ADAptive SYNthetic (ADASYN) algorithm was used to balance the class distribution, and the proposed HGSORF was compared with the other classifiers as well as with other studies. HGSORF achieved the best performance, with an accuracy of 98.33% on the PDHS dataset. The RF hyperparameters were also optimized with commonly used hyperparameter-optimization algorithms, and the proposed HGSORF provided comparatively better performance. Additionally, to analyze the causes of CS and their significance, HGSORF is explained locally and globally using eXplainable artificial intelligence (XAI)-based tools such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME). A decision support system has been developed as a potential application to support clinical staff.
All pre-trained models and relevant code are available at: https://github.com/MIrazul29/HGSORF_CSection.
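
The class-balancing step mentioned in the abstract can be illustrated with a minimal sketch of the general ADASYN idea: minority samples surrounded by more majority-class neighbours receive more synthetic samples, which are generated by linear interpolation toward a nearby minority sample. This is not the authors' implementation (see the repository above for that). The function name `adasyn_sketch`, its parameters, and the toy data are hypothetical and chosen only for illustration, and the sketch uses only the Python standard library.

```python
import math
import random


def adasyn_sketch(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (illustrative ADASYN-style sketch)."""
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_min = len(minority_pts)
    n_maj = len(X) - n_min
    # Target number of synthetic samples; beta=1.0 aims at full balance.
    total = int((n_maj - n_min) * beta)
    if total <= 0 or n_min < 2:
        return []

    # Step 1: for each minority sample, the share of majority-class points
    # among its k nearest neighbours (searched over the whole dataset).
    ratios = []
    for p in minority_pts:
        neighbours = sorted(
            (math.dist(p, q), label) for q, label in zip(X, y) if q is not p
        )[:k]
        n_major = sum(label != minority for _, label in neighbours)
        ratios.append(n_major / len(neighbours))

    # Step 2: normalise the ratios into a distribution over minority samples.
    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        weights = [1 / n_min] * n_min  # no hard samples: spread evenly
    else:
        weights = [r / ratio_sum for r in ratios]

    # Step 3: allocate synthetic samples per minority point. Rounding means
    # the final count only approximates `total`.
    counts = [round(w * total) for w in weights]

    # Step 4: interpolate between each minority sample and a randomly chosen
    # one of its k nearest minority neighbours.
    synthetic = []
    for p, count in zip(minority_pts, counts):
        near_min = sorted(
            (q for q in minority_pts if q is not p),
            key=lambda q: math.dist(p, q),
        )[:k]
        for _ in range(count):
            q = rng.choice(near_min)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(p, q)])
    return synthetic


# Toy data (made up): label 1 is the minority class.
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9],
     [0.9, 1.2], [1.2, 1.1], [0.1, 0.3], [1.0, 0.8]]
y = [1, 1, 0, 0, 0, 0, 1, 0]
synthetic = adasyn_sketch(X, y)
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data; whether and how the study did this is not stated in the abstract.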

Keywords: ADASYN; Cesarean section; HGSORF; Hyperparameter optimization; LIME; Machine learning; SHAP; XAI.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Bayes Theorem
  • Child
  • Humans
  • Infant, Newborn
  • Machine Learning*
  • Solubility