Statistical modeling of global geogenic arsenic contamination in groundwater

Environ Sci Technol. 2008 May 15;42(10):3669-75. doi: 10.1021/es702859e.


Contamination of groundwaters with geogenic arsenic poses a major health risk to millions of people. Although the main geochemical mechanisms of arsenic mobilization are well understood, the worldwide scale of affected regions is still unknown. In this study we used a large database of measured arsenic concentration in groundwaters (around 20,000 data points) from around the world as well as digital maps of physical characteristics such as soil, geology, climate, and elevation to model probability maps of global arsenic contamination. A novel rule-based statistical procedure was used to combine the physical data and expert knowledge to delineate two process regions for arsenic mobilization: "reducing" and "high-pH/ oxidizing". Arsenic concentrations were modeled in each region using regression analysis and adaptive neuro-fuzzy inferencing followed by Latin hypercube sampling for uncertainty propagation to produce probability maps. The derived global arsenic models could benefit from more accurate geologic information and aquifer chemical/physical information. Using some proxy surface information, however, the models explained 77% of arsenic variation in reducing regions and 68% of arsenic variation in high-pH/oxidizing regions. The probability maps based on the above models correspond well with the known contaminated regions around the world and delineate new untested areas that have a high probability of arsenic contamination. Notable among these regions are South East and North West of China in Asia, Central Australia, New Zealand, Northern Afghanistan, and Northern Mali and Zambia in Africa.

MeSH terms

  • Arsenic / analysis*
  • Models, Statistical*
  • Probability
  • Reproducibility of Results
  • Water Pollutants, Chemical / analysis*


  • Water Pollutants, Chemical
  • Arsenic