Construction and Early Warning Research of Water Quality Pollution Grading Prediction Model Driven by Multi-Source Data
DOI: 10.25236/icceme.2025.008
Author(s)
Zhangxin Huang, Liangfan Lin
Corresponding Author
Zhangxin Huang
Abstract
With the rapid advancement of urbanization and industrialization, water pollution has become increasingly prominent. Traditional water quality evaluation methods mostly rely on single indicators or expert experience, which makes it difficult to capture the multi-dimensional characteristics and dynamic evolution of pollution. To address this, this paper uses typical national water quality monitoring data to construct a water quality pollution classification modeling system that integrates dimensionality reduction, ensemble learning, and automatic hyperparameter tuning. Features are reduced through data preprocessing and principal component analysis (PCA), and composite indicators such as the nitrogen-phosphorus ratio and the oxygen demand intensity ratio are constructed from ecological principles to enhance the explanatory power of the variables. A multi-class classification model is built using an ensemble strategy that combines XGBoost, CatBoost, LightGBM, and Stacking. Spatio-temporal and dynamic features are introduced to improve the model's sensitivity to trends, and hyperparameters are optimized to improve model stability. Experimental results show that the model achieves an accuracy of 0.77 and a Macro-F1 of 0.73 on the five-level pollution classification task, outperforming the individual models. The study contributes an ecologically driven composite variable system, a multi-model ensemble optimization framework, and a "prediction label + probability threshold" dual-trigger early warning mechanism. These are both interpretable and practical, and can provide technical support for intelligent monitoring, risk early warning, and smart governance of the water environment.
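The abstract does not specify how the "prediction label + probability threshold" dual trigger is evaluated. The sketch below shows one plausible reading, under stated assumptions: classes are numbered 1 (cleanest) to 5 (most polluted), an alert fires when either the predicted class reaches a warning level or the summed probability of the classes at or above that level exceeds a threshold. The function name `dual_trigger_alert`, the default `warn_level` of 4, and the default `prob_threshold` of 0.3 are illustrative choices, not values from the paper.

```python
from typing import NamedTuple, Sequence


class Alert(NamedTuple):
    predicted_class: int   # most probable pollution class, 1..n
    severe_prob: float     # total probability of classes >= warn_level
    label_trigger: bool    # predicted class reached the warning level
    prob_trigger: bool     # severe-class probability reached the threshold
    fired: bool            # True if either trigger is active


def dual_trigger_alert(
    probs: Sequence[float],
    warn_level: int = 4,          # assumed warning level (hypothetical)
    prob_threshold: float = 0.3,  # assumed probability threshold (hypothetical)
) -> Alert:
    """Evaluate a two-condition early warning on one sample.

    probs[i] is the model's predicted probability of class i + 1.
    """
    if not 1 <= warn_level <= len(probs):
        raise ValueError("warn_level must be a valid class number")
    if abs(sum(probs) - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")

    # Trigger 1: the hard label (argmax class) is at or above the warning level.
    predicted = max(range(len(probs)), key=lambda i: probs[i]) + 1
    label_trigger = predicted >= warn_level

    # Trigger 2: the probability mass on severe classes is high, even if the
    # hard label still points to a milder class.
    severe_prob = sum(probs[warn_level - 1:])
    prob_trigger = severe_prob >= prob_threshold

    return Alert(predicted, severe_prob, label_trigger, prob_trigger,
                 label_trigger or prob_trigger)
```

In this reading, the probability trigger catches samples whose most likely class is still moderate but whose risk of severe pollution is already non-negligible, which is the kind of case a label-only rule would miss.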
Keywords
Water Pollution; Classification Model; PCA Dimensionality Reduction; Stacking Integration