Machine Learning Predictions of pH in the Glacial Aquifer System, Northern USA

Ground Water. 2021 May;59(3):352-368. doi: 10.1111/gwat.13063. Epub 2020 Dec 31.


A boosted regression tree model was developed to predict pH conditions in three dimensions throughout the glacial aquifer system of the contiguous United States using pH measurements in samples from 18,386 wells and predictor variables that represent aspects of the hydrogeologic setting. Model results indicate that the carbonate content of soils and aquifer materials strongly controls pH and, when coupled with long flowpaths, results in the most alkaline conditions. Conversely, in areas where glacial sediments are thin and carbonate-poor, pH conditions remain acidic. At depths typical of drinking-water supplies, predicted pH >7.5 (which is associated with arsenic mobilization) occurs more frequently than predicted pH <6 (which is associated with water corrosivity and the mobilization of other trace elements). A novel aspect of this model was the inclusion of numerically based estimates of groundwater flow characteristics (age and flowpath length) as predictor variables. The sensitivity of pH predictions to these variables was consistent with hydrologic understanding of groundwater flow systems and the geochemical evolution of groundwater quality. The model was not developed to provide precise estimates of pH at any given location. Rather, it can be used to more generally identify areas where contaminants may be mobilized into groundwater and where corrosivity issues may be of concern to prioritize areas for future groundwater monitoring.
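
As a rough illustration of the approach described above, the following sketch fits a small boosted ensemble of depth-1 regression trees and then screens predictions against the two pH thresholds named in the abstract. It is not the authors' implementation. The data are synthetic, and the two predictors (a carbonate index and a relative flowpath length, both scaled 0 to 1), the coefficients used to generate pH, the function names, and the hyperparameters are all assumptions made for this example.

```python
import random


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def fit_boosted(X, y, n_rounds=30, learning_rate=0.1):
    """Gradient boosting for squared error: each stump is fit to current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for row, p in zip(X, preds)]
    return base, stumps, learning_rate


def predict(model, row):
    """Sum the baseline and the shrunken contribution of every stump."""
    base, stumps, lr = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)


def screen(ph):
    """Flag a predicted pH against the thresholds quoted in the abstract."""
    if ph > 7.5:
        return "possible arsenic mobilization"
    if ph < 6.0:
        return "possible corrosivity or trace-element mobilization"
    return "neither threshold exceeded"


# Synthetic data (assumption): pH rises with carbonate content, more so on long flowpaths.
rng = random.Random(0)
X = [[round(rng.random(), 2), round(rng.random(), 2)] for _ in range(150)]
y = [5.5 + 2.0 * c + 1.0 * c * f + rng.gauss(0, 0.2) for c, f in X]

model = fit_boosted(X, y)

# Hypothetical settings: carbonate-rich with long flowpaths vs. carbonate-poor with short ones.
for label, row in [("carbonate-rich, long flowpath", [0.95, 0.95]),
                   ("carbonate-poor, short flowpath", [0.05, 0.05])]:
    ph = predict(model, row)
    print(f"{label}: predicted pH {ph:.2f} -> {screen(ph)}")
```

Consistent with the abstract's caution that the model is not meant to give precise pH estimates at any given location, the screening step is best read as a way to rank areas for monitoring rather than as a site-specific prediction.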

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Arsenic* / analysis
  • Environmental Monitoring
  • Groundwater*
  • Hydrogen-Ion Concentration
  • Machine Learning
  • United States
  • Water Pollutants, Chemical* / analysis


Substances

  • Water Pollutants, Chemical
  • Arsenic