AI-driven wastewater management through comparative analysis of feature selection techniques and predictive models


Dikmen F., DEMİR A., Özkaya B., Raza M. O., Rasheed J., Asuroglu T., ...

Scientific Reports, vol. 15, no. 1, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 15, Issue: 1
  • Publication Date: 2025
  • DOI: 10.1038/s41598-025-07124-0
  • Journal Name: Scientific Reports
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, BIOSIS, Chemical Abstracts Core, MEDLINE, Veterinary Science Database, Directory of Open Access Journals
  • Keywords: Artificial intelligence, Environmental engineering, Feature selection, Machine learning, Waste water treatment plan
  • Yıldız Technical University Affiliated: Yes

Abstract

The integration of artificial intelligence (AI) in wastewater treatment management offers a promising approach to optimizing effluent quality predictions and enhancing operational efficiency. This study evaluates the performance of machine learning models in predicting key wastewater effluent parameters: Chemical Oxygen Demand (COD), Biochemical Oxygen Demand (BOD), Total Suspended Solids (TSS), Total Effluent Nitrogen, and Total Effluent Phosphorus. Three feature selection techniques, SelectKBest, Mutual Information, and Recursive Feature Elimination (RFE) with Random Forest, were applied to identify the most significant predictors. The study leveraged ensemble learning models, including XGBoost, Random Forest, Gradient Boosting, and LightGBM, and compared them with Decision Tree models. The results demonstrate that effluent volatile suspended solids (VSS) consistently held the highest predictive importance across all feature selection methods. Ensemble models significantly outperformed Decision Trees, with Gradient Boosting achieving the best predictive accuracy for TSS and total nitrogen (Mean Absolute Error (MAE): 3.667; R²: 97.53%), XGBoost excelling in COD prediction with an MAE and R² of 6.251 and 83.41%, respectively, and XGBoost showing superior performance for BOD (MAE: 1.589; R²: 79.64%). LightGBM yielded the highest precision in predicting total phosphate, with an MAE and R² of 0.230 and 28.68%, respectively. Decision Tree models consistently underperformed, exhibiting the highest error rates. These findings highlight the potential of AI-driven approaches in wastewater management to improve decision-making, regulatory compliance, and resource efficiency. However, limitations such as operational irregularities and seasonal variations remain challenges for further refinement.
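
The abstract reports each model's performance as a Mean Absolute Error paired with a percentage score. Assuming that score is the coefficient of determination (R²) expressed as a percentage, the two metrics could be computed roughly as in the sketch below. The function names and the sample values are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left unexplained by the model.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical effluent measurements and model predictions (not study data).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

mae = mean_absolute_error(observed, predicted)
r2 = r2_score(observed, predicted)
print(f"MAE = {mae:.3f}, R^2 = {100 * r2:.2f}%")
```

MAE is expressed in the units of the predicted parameter, so values for different parameters (for example COD and total phosphate) are not directly comparable, whereas R² is unitless. This is one reason a low MAE can coexist with a low R², as the total phosphate figures in the abstract suggest.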