Accurate discharge and water level forecasting using ensemble learning with genetic algorithm and singular spectrum analysis-based denoising

Sci Rep. 2022 Nov 18;12(1):19870. doi: 10.1038/s41598-022-22057-8.


Forecasting discharge (Q) and water level (H) is essential to hydrological research and flood prediction. In recent years, deep learning has emerged as a viable technique for capturing non-linear relationships in historical data and generating highly accurate predictions. Despite its success in various domains, applying deep learning to Q and H prediction is hampered by three critical issues: a shortage of training data, noise in the collected data, and the difficulty of tuning the model's hyper-parameters. This work proposes a novel deep learning-based Q-H prediction model that addresses all three issues. Specifically, to address data scarcity and increase prediction accuracy, we design an ensemble learning architecture that combines multiple deep learning techniques. Furthermore, we use Singular Spectrum Analysis (SSA) to remove noise and outliers from the original data. In addition, we propose a novel Genetic Algorithm (GA)-based mechanism that automatically determines the prediction model's optimal hyper-parameters. We conducted extensive experiments on two datasets collected from Vietnam's Red and Dakbla rivers. The results show that our proposed solution outperforms current techniques across a range of metrics, including NSE, MSE, MAE, and MAPE. Specifically, ensemble learning improves the NSE by at least [Formula: see text]. Moreover, SSA-based data preprocessing further enhances the NSE by more than [Formula: see text]. Finally, thanks to GA-based optimization, our proposed model increases the NSE by at least [Formula: see text] and up to [Formula: see text] in the best case.
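To make the SSA-based denoising step and the NSE metric concrete, the following is a minimal sketch of basic SSA (embedding, singular value decomposition, truncation, diagonal averaging) together with the standard Nash-Sutcliffe efficiency formula. It is not the authors' implementation: the function names `ssa_denoise` and `nse`, the `window` and `n_components` parameters, and the use of NumPy are assumptions made for illustration, and the abstract does not state which window length or number of components was used.

```python
import numpy as np


def ssa_denoise(series, window, n_components):
    """Denoise a 1-D series with basic SSA (illustrative sketch).

    `window` (embedding length) and `n_components` (number of leading
    singular components kept) are hypothetical tuning parameters.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 1 < window < n:
        raise ValueError("window must satisfy 1 < window < len(series)")
    k = n - window + 1

    # Embedding: column j of the trajectory matrix is x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])

    # Decomposition and truncation to the leading components.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    r = min(n_components, s.size)
    approx = (u[:, :r] * s[:r]) @ vt[:r]

    # Diagonal averaging: entry (i, j) belongs to time index i + j.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        recon[i:i + k] += approx[i]
        counts[i:i + k] += 1
    return recon / counts


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return 1.0 - np.sum((o - p) ** 2) / np.sum((o - o.mean()) ** 2)
```

In this sketch, keeping fewer components removes more of the series, so `n_components` trades noise suppression against loss of signal; an NSE of 1 indicates a perfect match, and values at or below 0 indicate predictions no better than the observed mean.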