Integrated approach for arsenic prediction and health risk evaluation in community tube wells installed by public health department: comparative study of random forest, extreme gradient boosting, and deep neural networks


Ullah I., ARSLAN Ş., Ullah Z., Chen Z., Esteller M. V., Ullah H., ...

Environmental Geochemistry and Health, vol.48, no.6, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 48 Issue: 6
  • Publication Date: 2026
  • DOI Number: 10.1007/s10653-026-03150-7
  • Journal Name: Environmental Geochemistry and Health
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, Chemical Abstracts Core, Compendex, Environment Index, Geobase, INSPEC, MEDLINE
  • Keywords: DNN, Hydrogeochemistry, Public health risk, Water quality, XGBoost
  • Ankara University Affiliated: Yes

Abstract

Arsenic contamination in groundwater poses a significant public health risk, especially in developing countries with inadequate water quality monitoring systems. This study employs advanced machine learning (ML) approaches to assess arsenic pollution in 216 community tube well samples across Sindh Province, Pakistan. Three algorithms, Random Forest (RF), XGBoost, and Deep Neural Network (DNN), were applied alongside hydrochemical analysis to identify major contamination drivers. Water Quality Index (WQI) and arsenic distribution mapping revealed concerning contamination levels, while a health risk model evaluated potential threats from arsenic exposure. Arsenic concentrations ranged from 0.5 to 100 µg/L (mean: 16.2 µg/L), with 48% of samples exceeding WHO's safe limit of 10 µg/L. Hydrochemical analysis indicated that 46% of water samples were Ca-HCO₃ type, while 54% were mixed Ca-Mg-Cl type. Gibbs plots confirmed that groundwater chemistry is primarily influenced by rock weathering. Among the ML models, the DNN demonstrated superior performance (AUC: 0.978), outperforming RF (0.943) and XGBoost (0.937). Feature importance identified 13 critical factors affecting arsenic (As) contamination: TDS, HCO₃⁻, Fe, EC, Cl⁻, K⁺, Ca²⁺, Mg²⁺, SO₄²⁻, NO₃⁻, F⁻, and well depth. WQI analysis revealed that only 3.75% of samples met acceptable standards, 43.98% were of moderate quality, and 51.85% were of poor quality. Health risk assessments underscored severe threats, particularly to children, with Daily Metal Intake values far exceeding those of adults, Hazard Quotient values reaching up to four times the safety threshold, and Cancer Risk estimates surpassing EPA limits. These findings emphasize the urgent need for water treatment solutions and monitoring systems to mitigate arsenic exposure in affected communities.
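
The health risk metrics named in the abstract (Daily Metal Intake, Hazard Quotient, Cancer Risk) can be illustrated with a minimal sketch of the simplified form commonly used in drinking-water risk assessments. This is not the study's own model: the function names, the reference dose, the slope factor, and the intake rates and body weights below are all illustrative assumptions. Only the mean concentration of 16.2 µg/L is taken from the abstract.

```python
# Illustrative parameters: assumptions, NOT values reported by the study.
RFD_AS = 3.0e-4  # assumed oral reference dose for arsenic, mg/kg/day
CSF_AS = 1.5     # assumed oral cancer slope factor, (mg/kg/day)^-1


def daily_intake(conc_ug_per_l, intake_l_per_day, body_weight_kg):
    """Daily intake in mg/kg/day from a water concentration in µg/L."""
    conc_mg_per_l = conc_ug_per_l / 1000.0  # convert µg/L to mg/L
    return conc_mg_per_l * intake_l_per_day / body_weight_kg


def hazard_quotient(dmi):
    """Non-carcinogenic risk: intake divided by the reference dose."""
    return dmi / RFD_AS


def cancer_risk(dmi):
    """Carcinogenic risk: intake multiplied by the slope factor."""
    return dmi * CSF_AS


if __name__ == "__main__":
    mean_as = 16.2  # mean arsenic concentration from the abstract, µg/L
    # Hypothetical receptor groups: (label, water intake L/day, body weight kg)
    for label, intake, weight in [("adult", 2.0, 70.0), ("child", 1.0, 15.0)]:
        dmi = daily_intake(mean_as, intake, weight)
        print(f"{label}: DMI={dmi:.2e} mg/kg/day, "
              f"HQ={hazard_quotient(dmi):.2f}, CR={cancer_risk(dmi):.2e}")
```

Under these assumed parameters a child receives a higher dose per kilogram of body weight than an adult at the same concentration, which is the general mechanism behind the elevated risk for children that the abstract reports. A Hazard Quotient above 1 indicates that the intake exceeds the reference dose.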