arXiv:2409.10898v3 Announce Type: replace-cross Abstract: Ensuring safe water supplies requires effective water quality monitoring, especially in developing countries such as Nepal, where contamination risks are high. This paper introduces several hybrid deep learning models for water quality prediction. The models are trained and evaluated on the CCME dataset, which covers multiple water quality parameters from Canada, China, the UK, the USA, and Ireland and comprises 2.82 million feature-engineered records. Models such as CatBoost, XGBoost, and Extra Trees, along with neural networks combining CNN and LSTM layers, are used to capture temporal and spatial patterns in the data. The models demonstrated notable accuracy improvements, aiding proactive water quality control. The CatBoost, XGBoost, and Extra Trees regressors predicted Water Quality Index (WQI) values with an average RMSE of 1.2 and an R-squared score of 0.99. Additionally, the classifiers achieved 99% accuracy, cross-validated across models. SHAP analysis showed the importance of indicators such as F.R.C. and orthophosphate levels in the hybrid architectures' classification decisions. A practical application is demonstrated, along with a chatbot application for water quality insights.
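For readers less familiar with the two regression metrics reported for the WQI predictions, the following is a minimal sketch of how RMSE and R-squared are conventionally computed. It is not the authors' evaluation code; the function names and the sample WQI values are hypothetical and chosen only for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical WQI values, for illustration only (not taken from the paper's data).
observed = [82.0, 67.5, 91.0, 45.0]
predicted = [81.0, 69.0, 90.0, 46.5]

print("RMSE:", rmse(observed, predicted))
print("R-squared:", r_squared(observed, predicted))
```

RMSE is expressed in the same units as the WQI itself, so a lower value means predictions sit closer to the observed index, while an R-squared near 1 means the predictions account for almost all of the variance in the observed values.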
