-
In the deep learning model developed for the present study, we utilize the SSTAs at the initial time and the SSTA tendency (55°S–70°N, 0°–360°E), defined by the difference between the SSTA in the initial month and that at a two-month lead, as the predictors (Ham et al., 2019; Wang et al., 2020). The SSTA tendency includes the signals of SSTA evolution and ocean dynamics (Enfield and Mayer, 1997; Lau and Nath, 2003; McPhaden, 2012), which replaces upper ocean heat content anomalies often used in previous studies (Ham et al., 2019; Feng et al., 2022).
To forecast the two-dimensional structure of SSTAs, we first train our CNN model to predict the principal components of three leading empirical orthogonal function (EOF) modes of the tropical Pacific, and then reconstruct the predicted coefficients into two-dimensional SSTAs with the predefined EOF modes. We select the three leading EOF modes of the observed SSTAs in the tropical Pacific (10°S–10°N, 120°E–80°W), and the period from 1960 to 1986 to avoid future information leaking into the prediction system, which together can explain around 87% of the total SSTA variance [Fig. S1 in the electronic supplementary material (ESM)]. Comparison of the prediction skills of commonly used CNN models (Fig. 1) suggests that the VGG-11 model has obvious advantages for the present study, and thus this model is selected. Further details on the reason for choosing the VGG-11 model, the EOF decomposition and reconstruction, and other aspects of training the CNN model, can be found in section 2.3.
-
The historical simulations of 39 climate models participating in CMIP6 (Eyring et al., 2016) for the period 1948–2014 are used to train the CNN model [Table S1 in the electronic supplementary material (ESM)]. Previous studies have concluded that CMIP6 models offer considerable improvements in simulating the properties of ENSO relative to their CMIP5 counterparts (Eyring et al., 2019; Zelinka et al., 2020). The data during the period 1984–2009 are used to calculate the climatology and anomalies. Then, the SSTAs in all CMIP6 models are projected onto the three observed EOF modes, and the projection coefficients are treated as the predictands for training the CNN model. The monthly COBE-SST2 dataset (Hirahara et al., 2014) from 1984 to 2019 is used as the observation to provide the initial values, validate the prediction skill, and develop the fusion model. To compare with the prediction skill of the CNN model, the prediction results of six NMME models (Table S2 in the ESM) from 1986 to 2019 are also used (Kirtman et al., 2014), which represent the state-of-the-art in dynamical models for ENSO prediction. The NCEP-CFSv2 model is excluded in calculating the MME because of its short lead time. We also evaluated the skills between the CNN and MME of dynamical models for a shorter lead time by adding NCEP-CFSv2, and found no considerable change in skill (Fig. S2 in the ESM). All datasets are interpolated to a 2.5° × 2.5° spatial resolution, and the temporal signal below three months is removed using a Butterworth filter for comparison with previous studies. The result using the Butterworth filter is similar to that of the running average, so does not influence the conclusions.
-
The CNN model can capture the teleconnection of ENSO well (Ham et al., 2019), and is suitable to the limited sample size of the CMIP6 simulation dataset. Some architectures of CNN models, such as UNet, can predict spatial fields (Prabhat et al., 2021). Taylor and Feng (2022) achieved improvement in two-dimensional SST prediction using UNet-LSTM, but our preliminary study (not shown) indicates that a neural network with the UNet encoder–decoder module is not suitable for identifying the central location of ENSO. Thus, the CNN method with the one-dimensional output is selected in the present study to construct the deep learning model for ENSO prediction. To fit the one-dimensional output of the CNN model, we first predict the coefficients of three leading spatial modes of the tropical Pacific SSTAs, and then reconstruct two-dimensional patterns of SSTAs by using corresponding EOF modes.
The leading spatial modes of the tropical Pacific SSTAs are extracted by EOF analysis performed on the observed SSTAs in the equatorial Pacific for the period from 1960 to 1986 (10°S–10°N, 120°E–80°W) (Wilks, 2019). We select the first three EOF modes (Fig. S1), which explain 86.65% of the total SSTA variance, almost all large-scale interannual variabilities, and are unbiased to any types of ENSO (Takahashi et al., 2011). We choose the mean square error as the cost function and train the CNN model through iteration to minimize the error between the prediction and the true coefficients. An individual model is trained to optimize each coefficient. The projection coefficients of the SSTAs in CMIP6 models and in the CNN model output are normalized to have a mean of 0 and a standard deviation of 1.
We preliminarily test five commonly used CNN models—namely, VGG-11 (Simonyan and Zisserman, 2015), MobileNet, MobileNet-V2 (Sandler et al., 2018), DenseNet121 (Huang et al., 2017) and ResNet18 (He et al., 2016)—in the prediction with the same hyperparameters. The VGG-11 model is selected for further study because of its obvious advantages among the five models (Fig. 1). The VGG-11 model (Simonyan and Zisserman, 2015) is a classic CNN model with simple architecture composed of a 3 × 3 grid convolutional filter, pooling, and ReLU activation function (Fig. S3 in the ESM). The VGG-11 model has the smallest network capacity, i.e., model layers and number of parameters in the hidden layers, compared to the other four models, which prevents the problem of overfitting (Fig. 1). On the other hand, the larger training data spatial resolution (2.5° × 2.5°) used in the present study provides a larger network capacity than previously (Ham et al., 2019). Activation functions, such as Sigmoid, tanh and ReLU play a critical role in introducing nonlinearity to CNN models.
The ReLU function is included in the present CNN model, which is used to predict the coefficients of the three EOF modes one year ahead. We train 33 CNN submodels individually for the 11 lead months and three coefficients of EOF modes as an ensemble member. In order to eliminate the impacts of the initialized model weights on training, we train 10 members with slightly different and random initialized weights, and then the ensemble results of the 10 members are treated as the results of the prediction. The hyperparameters in the VGG-11 model are set as: mini-batch size = 128; epoch number = 35; initial learning rate = 0.0001; warm-up training schedule used in the first five epochs; and stochastic gradient descent with momentum for the optimizer. Batch normalization is applied for faster and more stable training. An early stop training strategy is used to prevent overfitting of the model.
After the coefficients for the EOF modes have been predicted at a certain lead time, we can reconstruct the two-dimensional pattern of SSTAs using the predicted coefficients and the associated EOF modes. Then, the reconstructed two-dimensional SSTAs serve as the final prediction for the tropical Pacific SSTAs at that lead time. There is no transfer learning applied in the present CNN model, which makes it different from previous linear reverse models based on observations (Newman and Sardeshmukh, 2017). The skill of the predicted SSTAs is evaluated with the full SSTA of the observation.
-
To further evaluate the prediction skill for the pattern of SSTAs during CP and EP ENSO events, we calculate the zonal SSTA center during the four types of ENSO events, i.e., EP El Niño, CP El Niño, EP La Niña, and CP La Niña. Although the spatial differences between the EP and CP La Niñas are not as apparent as those between the EP and CP El Ninos, the La Niñas are still divided into two types owing to the overall differences in amplitude, evolution, precursor, and location (Capotondi et al., 2015). These events are distinguished by modifying a unified complex ENSO index (UCEI) (Zhang et al., 2019):
where
in which
$ {N}_{3} $ and$ {N}_{4} $ denote the Niño-3 and Niño-4 indices, respectively. EP La Niña is defined for$-180^\circ < \theta < -90^\circ$ ; CP La Niña is defined for$-270^\circ < \theta < -180^\circ$ ; EP El Niño is defined for$0^\circ < \theta < 90^\circ$ , while CP El Niño is defined for$-90^\circ < \theta < 0^\circ$ . The classification of ENSO events is shown in Table 1.Type Events–DJF Number CP El Niño 1987/88, 1994/95, 2002/03, 2004/05, 2006/07, 2009/10, 2014/15, 2018/19 8 EP El Niño 1986/87, 1991/92, 1997/98, 2015/16 4 CP La Niña 1988/89, 1998/99, 2000/01, 2008/09, 2010/11, 2011/12 6 EP La Niña 1995/96, 1996/97, 1999/2000, 2005/06, 2007/08, 2017/18 6 Table 1. CP El Niño, EP El Niño, CP La Niña, and EP La Niña events during 1986–2019.
Then, we calculate the zonal center location of the equatorial (averaged within 5°S–5°N) profiles of the predicted Pacific SSTA for the four types of events. The zonal center location is defined as the longitude with the maximum of the zonal SSTA profiles for the positive events and the minimum for the negative events, and a three-point moving average is applied to the zonal profile of SSTAs. The composite locations are evaluated because it is hard to distinguish a single center in one individual event owing to the small-scale variations. As shown in Fig. S4 in the ESM, however, the NMME models cannot predict the large-scale pattern of SSTAs for some events at a long lead time. For example, the CNN model correctly forecasts the 1994/95 boreal winter as CP El Niño, whereas the NMME models forecast a La Niña event (Fig. S4a in the ESM). In this case, there is actually not a real center for the predicted SSTAs. Thus, we exclude the predictions with low skill for the large-scale pattern of SSTAs before evaluating the center location. The predictions whose pattern correlations with the observed SSTAs are less than 0.3 are replaced by the average center of other events at each lead time. The percentages of the selected events in calculating the zonal centers for each lead month are shown in Tables S3 and S4 in the ESM. The bootstrap method is used to estimate the confidence level of the composite prediction. In consideration of the small number of cases in each type, we randomly sample the center location from each ENSO type, and each time a new set of samples is obtained. The mean of the center location of this set of samples is calculated and considered to be the composite center position of the new samples. This operation is repeated 1000 times to obtain a distribution of ENSO center positions of this type, and the values of the 5% and 95% quartiles from the distribution are selected as confidence intervals.
-
The integrated gradients method (Sundararajan et al., 2017) is used in this study to attribute the prediction of the CNN model. Given an SSTA input
$ \mathbf{X} $ , the function$ F\left(\mathbf{X}\right) $ representing the CNN model is a highly nonlinear function of$ \mathbf{X} $ . The CNN model$ F\left(\mathbf{X}\right) $ can be approximated with a linear function by using first-order Taylor expansion:$ F\left(\mathbf{X}\right)\approx {\mathbf{\omega }}^{\mathrm{T}}\mathbf{X}+\mathbf{b}. $ Then, the gradients$ \omega $ for each grid point (i, j) in$ \mathbf{X} $ can be calculated asSpecifically, let
$ \mathbf{X} $ be the input for the climatological SST, i.e., zero SSTAs, and$ {\mathbf{X}}^{{{'}}} $ be the input of the initial-month SSTAs. Integrated gradients can be obtained by computing the path integral of the gradients along the straight-line path from the zero SSTAs of the climatology to the SSTAs in the initial month at each grid point. Given a straight-line path function$\gamma \left(\mathit{\alpha }\right)=\mathbf{X}+\mathbf{\alpha } (\mathbf{X}^{{'}}-\mathbf{X})$ , the integrated gradients$ {\mathrm{I}\mathrm{G}}_{i,j} $ for grid point (i, j) are defined as:Here, the integrated gradients for each grid point of the SSTA fields are calculated using the Captum Python package (Kokhlikyan et al., 2020).
Since the third EOF mode is mainly associated with the warming trend in the equatorial Pacific, which shows a slight influence on the prediction skill, the integrated gradients for the coefficients of the two leading EOF modes are calculated and added together as the result to show in the heat map, representing the estimated contribution of the initial SSTAs to the prediction of the CNN model. We scale the result in each grid point by the maximum absolute value, leading to a unitless heat map. Although a previous study illustrated that the integrated gradient method has some limitations to explain the results of the deep learning method (Mamalakis et al., 2022), it can still track down the small-scale, midlatitude signals that appear one year ahead.
-
Considering the respective advantages of the dynamical models and the CNN model in the eastern and central Pacific, we develop a fusion model combining their predictions. The fusion model combines the predictions from the two types of models in each lead month and grid point as follows:
in which
$ {\mathbf{X}}_{l,i,j}^{\mathrm{F}\mathrm{u}\mathrm{s}\mathrm{i}\mathrm{o}\mathrm{n}} $ ,$ {\mathbf{X}}_{l,i,j}^{\mathrm{C}\mathrm{N}\mathrm{N}} $ and$ {\mathbf{X}}_{l,i,j}^{\mathrm{d}\mathrm{y}\mathrm{n}\mathrm{a}\mathrm{m}\mathrm{i}\mathrm{c}} $ denote the prediction of the fusion model, the CNN model and the dynamical models, respectively, for lead month$ l $ and grid point$ (i,j) $ , and$ \beta $ is the fusion weight coefficient. Here,$ \beta $ is calculated asin which
$ {D}_{l,i,j}^{\mathrm{C}\mathrm{N}\mathrm{N}} $ and$ {D}_{l,i,j}^{\mathrm{d}\mathrm{y}\mathrm{n}\mathrm{a}\mathrm{m}\mathrm{i}\mathrm{c}} $ denote the distance between observations and the predictions of the CNN model and the dynamical models, respectively. The distance metrics, Wasserstein distance and RMSE, are used to define two fusion strategies. Wasserstein distance is an efficient method that can be applied to data in any dimensions to rank the models’ performances in simulating the physical field (Vissio et al., 2020).$ \beta $ is calculated based on data from 1984 to 1985 in the CNN model and NMME model, which are not used in the prediction skill evaluation.
Type | Events–DJF | Number |
CP El Niño | 1987/88, 1994/95, 2002/03, 2004/05, 2006/07, 2009/10, 2014/15, 2018/19 | 8 |
EP El Niño | 1986/87, 1991/92, 1997/98, 2015/16 | 4 |
CP La Niña | 1988/89, 1998/99, 2000/01, 2008/09, 2010/11, 2011/12 | 6 |
EP La Niña | 1995/96, 1996/97, 1999/2000, 2005/06, 2007/08, 2017/18 | 6 |