Uncertainty-based saltwater intrusion prediction using integrated Bayesian machine learning modeling (IBMLM) in a deep aquifer

J Environ Manage. 2024 Mar:354:120252. doi: 10.1016/j.jenvman.2024.120252. Epub 2024 Feb 22.

Abstract

Data-driven machine learning approaches are promising substitutes for physically based numerical groundwater models, because they capture input-output relationships at a much lower computational burden. However, their performance and reliability are strongly influenced by different sources of uncertainty. Conventional studies generally rely on a stand-alone machine learning surrogate and fail to account for errors in model outputs that result from structural deficiencies. To overcome this issue, this study proposes a flexible integrated Bayesian machine learning modeling (IBMLM) method that explicitly quantifies the uncertainties originating from the structures and parameters of machine learning surrogate models. An Expectation-Maximization (EM) algorithm is combined with Bayesian model averaging (BMA) to find the maximum likelihood estimates and construct the posterior predictive distribution. Three machine learning approaches representing different levels of model complexity are incorporated in the framework: artificial neural network (ANN), support vector machine (SVM) and random forest (RF). The proposed IBMLM method is demonstrated on a field-scale, real-world case, the "1500-foot" sand aquifer in Baton Rouge, USA, where overexploitation has caused serious saltwater intrusion (SWI). This study adds to the understanding of how chloride concentration transport responds to multi-dimensional extraction-injection remediation strategies in a sophisticated saltwater intrusion model. Results show that most IBMLM models exhibit correlation coefficient (r) values above 0.98 and Nash-Sutcliffe efficiency (NSE) values above 0.93, both slightly higher than those of the individual machine learning models. This confirms that the IBMLM provides better predictions than individual machine learning models while maintaining the advantage of high computing efficiency. The IBMLM is found to be useful for predicting saltwater intrusion without running the physically based numerical simulation model.
We conclude that explicitly considering machine learning model structure uncertainty, along with parameter uncertainty, improves the accuracy and reliability of predictions and also corrects the uncertainty bounds. The applicability of the IBMLM framework can be extended to regions where a physical hydrogeologic model is difficult to build owing to a lack of subsurface information.
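
The abstract does not give the equations behind the EM-BMA step, so the following is only a minimal sketch of one common formulation: a Gaussian mixture over member predictions, with the weights and per-model variances estimated by EM. It is not the authors' implementation. The function names (em_bma, bma_predict, nse, pearson_r), the Gaussian error assumption, and the synthetic data standing in for ANN, SVM and RF surrogate outputs are all assumptions made for illustration.

```python
import numpy as np

def em_bma(preds, obs, n_iter=500, tol=1e-8):
    """Estimate BMA weights and per-model error variances by EM.

    preds: (K, T) array of predictions from K surrogate models.
    obs:   (T,) array of reference values (e.g. simulator outputs).
    Assumes Gaussian errors around each member prediction (an assumption).
    """
    preds = np.asarray(preds, dtype=float)
    obs = np.asarray(obs, dtype=float)
    K, _ = preds.shape
    err2 = (obs[None, :] - preds) ** 2
    w = np.full(K, 1.0 / K)                      # start from equal weights
    var = np.maximum(err2.mean(axis=1), 1e-12)   # start from each model's MSE
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of model k for sample t, computed in log space
        log_dens = -0.5 * (np.log(2 * np.pi * var)[:, None] + err2 / var[:, None])
        log_mix = np.log(np.maximum(w, 1e-300))[:, None] + log_dens
        m = log_mix.max(axis=0)
        log_total = m + np.log(np.exp(log_mix - m).sum(axis=0))
        z = np.exp(log_mix - log_total)
        # M-step: update weights and variances from the responsibilities
        w = z.mean(axis=1)
        var = np.maximum((z * err2).sum(axis=1) / np.maximum(z.sum(axis=1), 1e-12), 1e-12)
        ll = log_total.sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return w, var

def bma_predict(preds, w, var):
    """BMA predictive mean and variance (between-model + within-model)."""
    preds = np.asarray(preds, dtype=float)
    mean = w @ preds
    between = (w[:, None] * (preds - mean) ** 2).sum(axis=0)
    within = float((w * var).sum())
    return mean, between + within

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - ((obs - sim) ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

def pearson_r(obs, sim):
    """Pearson correlation coefficient between reference and prediction."""
    return float(np.corrcoef(obs, sim)[0, 1])

# Synthetic stand-ins for three surrogate outputs (NOT data from the study).
rng = np.random.default_rng(0)
obs = np.linspace(0.0, 10.0, 200)
preds = np.stack([obs + rng.normal(0.0, s, obs.size) for s in (0.2, 0.4, 0.8)])

weights, variances = em_bma(preds, obs)
bma_mean, pred_var = bma_predict(preds, weights, variances)
print("weights:", np.round(weights, 3))
print("r:", round(pearson_r(obs, bma_mean), 4), "NSE:", round(nse(obs, bma_mean), 4))
```

In this sketch the predictive variance is the sum of a between-model term (the spread of the member predictions around the BMA mean) and a within-model term (the weighted error variances), which is one way to express the structure and parameter contributions to uncertainty that the abstract refers to.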

Keywords: Aquifer salinization; Bayesian model averaging; Machine learning approach; SEAWAT model; Uncertainty quantification.

MeSH terms

  • Bayes Theorem
  • Groundwater* / chemistry
  • Machine Learning
  • Reproducibility of Results
  • Uncertainty