Improved prediction of extreme rainfall using a machine learning approach
Abstract
Under global warming, extreme precipitation in the Yellow River Basin (YRB) has shown an increasing trend, posing serious threats to regional socioeconomic development and the environment. We combined empirical orthogonal function (EOF) analysis with the year-by-year increment method to extract the spatial patterns and principal components (PCs) of the increments of total summer extreme precipitation (R95pTOT-DY) in the YRB from 1962 to 2010. Prediction models for the PCs during 2011–2022 were then built with two machine learning (ML) algorithms, LightGBM and CatBoost. Without human intervention, these models identified significant climatic factors from 114 ocean and circulation indices at lead times of 1–6 months. For the PCs, LightGBM achieved time correlation coefficients (TCCs) of 0.53, 0.63, and 0.60, and CatBoost achieved TCCs of 0.50, 0.69, and 0.57, respectively. Notably, the ML prediction of PC3 significantly outperformed that of traditional linear regression (LR). R95pTOT-DY prediction improved in the source area and middle reach of the YRB. The ensemble of LightGBM and CatBoost gave the best R95pTOT-DY prediction and performed reasonably in forecasting R95pTOT over the 12 years, surpassing the traditional LR method. SHapley Additive exPlanations (SHAP) analysis was used to identify the sources of model skill; it attributed the improvement in PC3 to the contribution of the North American zonal polar vortex area index (NAPVAI) to the accurate prediction of extreme precipitation events. Overall, the ML-based incremental model improves the prediction of extreme precipitation in the YRB, offering valuable insights for risk assessment and timely warnings of summer rainstorms in this critical region.
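The increment-plus-EOF step described above can be illustrated with a minimal NumPy sketch. It rests on assumptions not stated in the abstract: the year-by-year increment is taken as the difference between a year and the preceding year, the EOFs are obtained from an unweighted SVD of the time-mean-removed increment matrix, and the data are synthetic values at 20 hypothetical stations. All function names are illustrative. In the study, the PCs for 2011–2022 are predicted by LightGBM and CatBoost; here the observed PCs stand in for those predictions.

```python
import numpy as np

def year_to_year_increment(field):
    """Difference between each year and the preceding year (axis 0 = year)."""
    return field[1:] - field[:-1]

def eof_decompose(increments, n_modes=3):
    """EOF analysis via SVD of the time-mean-removed (year x station) matrix."""
    mean = increments.mean(axis=0)
    anomalies = increments - mean
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    patterns = vt[:n_modes]              # spatial patterns, shape (mode, station)
    pcs = u[:, :n_modes] * s[:n_modes]   # principal components, shape (year, mode)
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)
    return patterns, pcs, mean, explained

def reconstruct_total(pcs, patterns, mean, previous_year_field):
    """Rebuild the increment field from PCs, then add the previous year's value."""
    increment = pcs @ patterns + mean
    return previous_year_field + increment

# Synthetic stand-in data: 49 years (1962-2010) at 20 hypothetical stations.
rng = np.random.default_rng(0)
r95ptot = rng.gamma(shape=2.0, scale=30.0, size=(49, 20))

dy = year_to_year_increment(r95ptot)  # 48 increments per station
patterns, pcs, mean, explained = eof_decompose(dy, n_modes=3)

# Three-mode approximation of R95pTOT for years 2..49 (assumed workflow).
approx = reconstruct_total(pcs, patterns, mean, r95ptot[:-1])
```

Under these assumptions, a forecast of R95pTOT for a target year is the previous year's observed field plus the increment field rebuilt from the predicted PCs, which is why skill in the PCs carries over to the R95pTOT forecasts.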