This section describes the steps used to optimize the samples of the ensemble forecast. Here, a sample denotes one ensemble member.
First, the samples were scored and ranked. Each ensemble had 60 samples. Once the samples and observations were available, the track forecast error and the intensity forecast error of each sample were calculated. The track forecast error is defined as the great-circle distance between the predicted and observed positions of a TC at the same time. The intensity forecast error is the absolute difference between the predicted and observed minimum sea level pressure (MSLP). For both the track score and the intensity score at a given hour, the sample with the smallest error ranked first and received the highest score (60), and the sample with the largest error ranked last and received the lowest score (1); both scores are dimensionless. It should be emphasized that the TC observations provided by the Central Meteorological Station (CMS) are available in real time and were therefore selected as the reference standard for assessing sample quality, because the best track information for TCs cannot be obtained in time during an operational forecast.
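The scoring step above can be sketched as follows. This is a minimal illustration, not the authors' code: the haversine formula is one common way to compute a great-circle distance, the Earth radius value is an assumption, and the function names and the tie-breaking rule (earlier members rank first among equal errors) are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; not stated in the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def intensity_error_hpa(predicted_mslp, observed_mslp):
    """Absolute MSLP error in hPa."""
    return abs(predicted_mslp - observed_mslp)


def errors_to_scores(errors):
    """Smallest error gets the highest score (n); largest error gets 1."""
    n = len(errors)
    order = sorted(range(n), key=lambda i: errors[i])  # member indices, best first
    scores = [0] * n
    for rank, member in enumerate(order, start=1):
        scores[member] = n - rank + 1
    return scores
```

With 60 members, `errors_to_scores` returns scores from 60 (rank 1) down to 1 (rank 60), matching the scheme described in the text.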
For each sample, the track score (A) and the intensity score (B) were multiplied by the weights $ w $ and $ 1-w $, respectively, and then added to give the final score (C) of the sample, as in Eq. 1:
$$ C = w \times A + (1-w) \times B \tag{1} $$
The values of $ w $ were 1.0, 0.9, 0.8, 0.7, 0.6, and 0.5. Different values of $ w $ represented different experiments, which were named “tr”, “91”, “82”, “73”, “64”, and “55”, respectively. The experiment “tr” adjusts the distribution of samples based on the observed track only ($ w $ = 1.0), while the other experiments perform sample optimization based on both the observed track and the observed intensity. In those experiments, $ w $ is given by the leading digit of the experiment name. For example, $ w $ was 0.8 in experiment “82” and 0.6 in experiment “64”. Suppose that in experiment “73” a sample's track error ranks 38 (track score 23) and its intensity error ranks 42 (intensity score 19); the final score is then C = 0.7 $ \times $ 23 + 0.3 $ \times $ 19 = 21.8.
Second, following the aforementioned definition of good and bad samples, the bad samples were replaced by supplemental good samples to reflect the best estimate of the true state of the atmosphere. Note that the supplemental good samples differ from the original good ones by perturbation increments, following the theoretical method of Li et al. (2018b). In these experiments, five TCs were simulated: Haima (201622), Merbok (201702), Hato (201713), Ewiniar (201804), and Maria (201808). In this naming convention, the number in parentheses gives the year followed by the sequence number of the TC; for example, Maria (201808) was the eighth TC of 2018. To compare the results of sample optimization, seven experiments (“SOno”, “tr”, “91”, “82”, “73”, “64”, “55”) were performed for each TC included in this research, where “SOno” denotes the ensemble forecast without sample optimization.
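The weighted final score and the worked example for experiment “73” can be checked with a short sketch. The dictionary and function names are illustrative; only the weights, the experiment names, and the 60-member scoring rule come from the text.

```python
# Track weight w for each experiment name, as listed in the text.
TRACK_WEIGHT = {"tr": 1.0, "91": 0.9, "82": 0.8, "73": 0.7, "64": 0.6, "55": 0.5}


def rank_to_score(rank, n_members=60):
    """Rank 1 (smallest error) maps to score n_members; rank n_members maps to 1."""
    return n_members - rank + 1


def final_score(experiment, track_score, intensity_score):
    """Final score C = w * A + (1 - w) * B for one ensemble member."""
    w = TRACK_WEIGHT[experiment]
    return w * track_score + (1.0 - w) * intensity_score


# Worked example from the text: track rank 38, intensity rank 42, experiment "73".
example = final_score("73", rank_to_score(38), rank_to_score(42))
print(round(example, 1))
```

In experiment “tr” the intensity score has zero weight, so the final score equals the track score.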
All experiments were performed using the Advanced Research version of the Weather Research and Forecasting (WRF-ARW) model (version 3.4.1). The ensemble was initialized using the European Centre for Medium-Range Weather Forecasts (ECMWF) 0.125° × 0.125° analysis data for the TCs Haima (201622), Merbok (201702), and Hato (201713). With the improved resolution available from 2018, the ECMWF 0.1° × 0.1° analysis data were used for the TCs Ewiniar (201804) and Maria (201808). The horizontal grid spacing was 3 km, with 50 vertical layers. The parameterization schemes followed Li et al. (2018a, b). The number of horizontal grid points for each TC is listed in Table 1.
The experimental process was completed in three steps. The first step was spin-up. The initial 60 ensemble members were generated by adding perturbations, randomly sampled from the default “cv3” background error covariance option in the WRF-3DVar package, and were then integrated for 6 h. These initial perturbations are balanced and random, and they become flow-dependent under the constraints of atmospheric dynamics in the NWP model after several hours of integration. Thus, the background error covariance estimated from the 6-h ensemble forecasts can reflect, to some extent, the uncertainties related to the TC circulation (especially the inner-core structure). However, 6-h ensemble forecasts are still insufficient to fully estimate the background error covariance for the synoptic-scale environment, which may require a few days of ensemble forecasts to evolve. Future work should therefore integrate the ensemble forecasts for longer lead times when estimating the background error covariance. The WRF-based ensemble Kalman filter (EnKF) system was first developed for regional-scale data assimilation by Meng and Zhang (2008a, b). The control variables, the perturbed variables, the horizontal and vertical lengths of the covariance localization, and the covariance relaxation method all followed Zhu et al. (2016). The perturbed variables included the horizontal wind components (u, v), potential temperature, and the water vapor mixing ratio, with standard deviations of 2 m s−1 for wind, 1 K for temperature, and 0.5 g kg−1 for mixing ratio. Similar perturbations were also used to represent the boundary condition uncertainties of the ensemble. The covariance relaxation method was used to inflate the background error covariance with a relaxation coefficient of 0.8.
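The text gives only the relaxation coefficient (0.8), not the relaxation formula. As a rough sketch, assuming the commonly used relaxation-to-prior form in which each member's analysis deviation from the ensemble mean is blended with its prior deviation, the step might look like the code below. Both the form and the choice to put the 0.8 weight on the prior deviation are assumptions, and the function name is hypothetical.

```python
def relax_to_prior(prior_devs, analysis_devs, coeff=0.8):
    """Blend prior and analysis deviations from the ensemble mean.

    ASSUMPTION: relaxed = coeff * prior + (1 - coeff) * analysis, with coeff
    weighting the prior deviation. The source states only the value 0.8.
    """
    if len(prior_devs) != len(analysis_devs):
        raise ValueError("prior and analysis deviations must have the same length")
    return [coeff * pb + (1.0 - coeff) * pa for pb, pa in zip(prior_devs, analysis_devs)]


# One state variable across three members (illustrative numbers only).
prior = [2.0, -1.0, -1.0]
analysis = [1.0, -0.5, -0.5]
relaxed = relax_to_prior(prior, analysis)
```

Because the analysis spread is usually smaller than the prior spread, blending toward the prior deviations retains more ensemble spread than the unmodified analysis would.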
The updated prognostic variables were the perturbation potential temperature (T), vertical velocity (W), horizontal wind components (U and V), mixing ratios for water vapor (QVAPOR), cloud water (QCLOUD), and rainwater (QRAIN), perturbation geopotential (PH), perturbation dry air mass in column (MU), surface pressure (PSFC), and perturbation pressure (P). The horizontal length of the covariance localization was set to 30 km, and the vertical length was set to six layers. Details of this step can be found in Li et al. (2018a, b).
The second step was sample optimization. After the spin-up, data assimilation was performed using the WRF-EnKF method with relocation (C0), where only the observed positioning data were assimilated. The subsequent 6-h cycles of sample optimization were named C1, C2, C3, C4, C5, and C6, respectively (Li et al., 2020b). Because the simulated TCs were far from landfall, sample optimization is more important than assimilation according to previous research (Li et al., 2018a, b, 2020a, b). Table 2 shows the specific operation of the seven schemes. It should be emphasized that in the six experiments with sample optimization, 20 bad samples were replaced by 20 supplemental samples copied from good samples, while the other 20 moderate samples (i.e., samples identified as neither good nor bad) were retained without replacement. In addition, perturbations were added to the supplemental samples by the EnKF; that is, the 20 copied samples acquired an assimilation increment (Li et al., 2018b). In the EnKF, α is a constant, and $ HP^{b}H^{T} $ and R are scalars representing the background and observational-error variance at the observation location, respectively (Whitaker and Hamill, 2002).
TC | Spin-up | Relocation | Optimization | Ensemble average | End of prediction | Horizontal regions (lon × lat) |
Haima (201622) | 12 UTC 10/18/2016 | 18 UTC 10/18/2016 | 19 UTC 10/18/2016 | 00 UTC 10/19/2016 | 00 UTC 10/21/2016 | 889 × 667 |
Merbok (201702) | 12 UTC 06/10/2017 | 18 UTC 06/10/2017 | 19 UTC 06/10/2017 | 00 UTC 06/11/2017 | 00 UTC 06/13/2017 | 298 × 667 |
Hato (201713) | 00 UTC 08/21/2017 | 09 UTC 08/21/2017 | 10 UTC 08/21/2017 | 15 UTC 08/21/2017 | 12 UTC 08/23/2017 | 814 × 364 |
Ewiniar (201804) | 00 UTC 06/04/2018 | 06 UTC 06/04/2018 | 07 UTC 06/04/2018 | 12 UTC 06/04/2018 | 12 UTC 06/06/2018 | 260 × 461 |
Maria (201808) | 00 UTC 07/08/2018 | 06 UTC 07/08/2018 | 07 UTC 07/08/2018 | 12 UTC 07/08/2018 | 12 UTC 07/10/2018 | 889 × 482 |
Table 1. The start times of the spin-up, relocation, optimization, and ensemble average, the end times of the prediction, and the horizontal grids for each TC.
The supplemental samples differed from the good ones by a perturbation, which arose from the difference in α: α was nonzero for the good samples but was set to 0 for the supplemental ones in this research.
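To make the role of α concrete, the sketch below assumes the single-observation square-root filter form given by Whitaker and Hamill (2002), in which the gain applied to a member's deviation from the ensemble mean is scaled by α = 1 / (1 + sqrt(R / (HPbHT + R))). The source does not reproduce this expression, so treating it as the one used in the experiments is an assumption, and the function and argument names are hypothetical.

```python
import math


def reduced_gain_factor(hpbht, r):
    """ASSUMED form: alpha = 1 / (1 + sqrt(R / (HPbH^T + R))) for one scalar observation."""
    return 1.0 / (1.0 + math.sqrt(r / (hpbht + r)))


def update_deviation(x_dev, hx_dev, pbht, hpbht, r, alpha):
    """Update one member's deviation for one state variable.

    x_dev  : prior deviation of the state variable from the ensemble mean
    hx_dev : prior deviation of the member in observation space
    pbht   : covariance between the state variable and the observed quantity
    """
    gain = pbht / (hpbht + r)  # scalar Kalman gain K
    return x_dev - alpha * gain * hx_dev


alpha = reduced_gain_factor(hpbht=1.0, r=1.0)
updated = update_deviation(x_dev=2.0, hx_dev=2.0, pbht=1.0, hpbht=1.0, r=1.0, alpha=alpha)
unchanged = update_deviation(x_dev=2.0, hx_dev=2.0, pbht=1.0, hpbht=1.0, r=1.0, alpha=0.0)
```

Under this form, setting α to 0 leaves a member's deviation untouched by the deviation update, while a nonzero α shrinks it toward the ensemble mean.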
It should be noted that, considering the intended results and the workload of one experiment, the number of samples replaced in each sample optimization was set to 20 members. The ensemble nevertheless retained its original size (60 members) at the end of each cycle (Li et al., 2020b).
The third step was integration. The initial value (the ensemble average of the 60 samples) was generated from the final analysis of the sample optimization step and then integrated forward for a 48-h forecast.
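The replacement of 20 bad samples by copies of 20 good samples, with the ensemble size held at 60, can be sketched as below. This is an illustration only: the way good and bad members are paired is an assumption, members are represented by placeholder labels, and the EnKF increment that the copied samples subsequently receive is not modelled.

```python
import copy


def replace_bad_samples(members, final_scores, n_replace=20):
    """Overwrite the n_replace lowest-scoring members with copies of the
    n_replace highest-scoring ones; all other members are left unchanged."""
    n = len(members)
    if len(final_scores) != n:
        raise ValueError("one score is required per member")
    if 2 * n_replace > n:
        raise ValueError("good and bad groups would overlap")
    order = sorted(range(n), key=lambda i: final_scores[i], reverse=True)  # best first
    good = order[:n_replace]
    bad = order[n - n_replace:]
    new_members = list(members)
    # ASSUMPTION: the best good member is paired with the first listed bad member, and so on.
    for g, b in zip(good, bad):
        new_members[b] = copy.deepcopy(members[g])
    return new_members, good, bad


members = [f"m{i:02d}" for i in range(60)]   # placeholder member labels
scores = [float(i + 1) for i in range(60)]   # member i has final score i + 1
new_members, good, bad = replace_bad_samples(members, scores)
```

The returned list still has 60 entries: 20 original good members, 20 retained moderate members, and 20 copies of the good members in the slots of the bad ones.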
Taking the 0000 UTC forecast as an example, in operational practice the ECMWF initial field data are usually obtained about six hours later. At that point the CMS record at 0600 UTC already exists, but the records from 0700 to 1200 UTC are not yet available, so there is a certain forecast lag. If this technique is applied in operational forecasting, the spin-up and sample optimization cycles should be shortened to three hours to avoid reducing the lead time of the forecast.
Table 1 lists the timing of each stage and the horizontal grid dimensions for each experiment. For instance, Merbok (201702) was the second TC of 2017. As can be seen in Table 1, the ensemble forecast for TC Merbok (201702) started at 1200 UTC on 10 June 2017. After 6 h of spin-up, relocation was performed at 1800 UTC on 10 June 2017, and the cycles of sample optimization then ran from 1900 UTC on 10 June to 0000 UTC on 11 June 2017. The ensemble average was taken at 0000 UTC on 11 June 2017. Finally, the prediction ended at 0000 UTC on 13 June 2017. The other four TCs followed a similar procedure.
Scheme | Specific operation |
SOno | Only relocated at C0 |
tr | Relocated at C0, sample optimization from C1 to C6 only referring to the observed track |
91 | Relocated at C0, sample optimization from C1 to C6 according to the ratio of 9 (the observed track) to 1 (the observed intensity) |
82 | Relocated at C0, sample optimization from C1 to C6 according to the ratio of 8 (the observed track) to 2 (the observed intensity) |
73 | Relocated at C0, sample optimization from C1 to C6 according to the ratio of 7 (the observed track) to 3 (the observed intensity) |
64 | Relocated at C0, sample optimization from C1 to C6 according to the ratio of 6 (the observed track) to 4 (the observed intensity) |
55 | Relocated at C0, sample optimization from C1 to C6 according to the ratio of 5 (the observed track) to 5 (the observed intensity) |
Table 2. The specific operation of the seven schemes.
It is necessary to mention that there was no ECMWF data at 1200 UTC on 21 August 2017 for TC Hato (201713), so the ensemble prediction started at 0000 UTC on 21 August 2017. The observations could only be obtained from 0900 UTC on 21 August 2017. Therefore, sample optimization was delayed by 3 h.
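The stage timings in Table 1 up to the ensemble average can be reproduced with a small sketch. It assumes that relocation (C0) occurs at the end of the spin-up and that C1 to C6 follow at hourly intervals, which is inferred from the Table 1 entries rather than stated explicitly; the function name and arguments are hypothetical.

```python
from datetime import datetime, timedelta


def cycle_times(spinup_start, spinup_hours=6, n_cycles=6):
    """Return the relocation time (C0) and the times of C1..Cn.

    ASSUMPTION: cycles are spaced one hour apart, as inferred from Table 1.
    """
    relocation = spinup_start + timedelta(hours=spinup_hours)
    cycles = [relocation + timedelta(hours=k) for k in range(1, n_cycles + 1)]
    return relocation, cycles


# Merbok (201702): 6-h spin-up starting at 1200 UTC on 10 June 2017.
merbok_c0, merbok_cycles = cycle_times(datetime(2017, 6, 10, 12))

# Hato (201713): spin-up from 0000 UTC on 21 August 2017, relocation at 0900 UTC.
hato_c0, hato_cycles = cycle_times(datetime(2017, 8, 21, 0), spinup_hours=9)
```

The last cycle time corresponds to the ensemble average column of Table 1; the end of the prediction is read from Table 1 rather than computed here.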