Advanced Search
Article Contents

Data Assimilation Method Based on the Constraints of Confidence Region


doi: 10.1007/s00376-017-7045-y

  • The ensemble Kalman filter (EnKF) is a distinguished data assimilation method that is widely used and studied in various fields including methodology and oceanography. However, due to the limited sample size or imprecise dynamics model, it is usually easy for the forecast error variance to be underestimated, which further leads to the phenomenon of filter divergence. Additionally, the assimilation results of the initial stage are poor if the initial condition settings differ greatly from the true initial state. To address these problems, the variance inflation procedure is usually adopted. In this paper, we propose a new method based on the constraints of a confidence region constructed by the observations, called EnCR, to estimate the inflation parameter of the forecast error variance of the EnKF method. In the new method, the state estimate is more robust to both the inaccurate forecast models and initial condition settings. The new method is compared with other adaptive data assimilation methods in the Lorenz-63 and Lorenz-96 models under various model parameter settings. The simulation results show that the new method performs better than the competing methods.
    摘要: 集合Kalman滤波在数据同化方法中被广泛采用. 然而, 由于样本量个数的限制以及动态模型往往含有偏差, 集合Kalman滤波容易出现预报误差方差低估的问题, 从而进一步导致滤波发散的现象发生. 另外, 当初始状态的设置偏离真实值较远时, 初始时间点附近的估计结果也往往不是很理想. 为了解决上面的问题, 方差膨胀方法被广泛采用. 在这篇文章里, 我们提出了一种基于观测值构造相应的置信区间来估计预报方差膨胀参数的方法(EnCR). 新方法的估计结果在含有不精确的初值设置或者带有偏差的预报模型中均表现较为稳健. 最后, 我们将新方法分别用到 Lorenz-63 与 Lorenz-96 模型中, 从模拟结果我们可以看出, 新方法比以往的方法具有更精确的估计结果.
  • 加载中
  • Anderson J. L., S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev.,127(11), 2741-2758, .https://doi.org/10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2
    Burgers G., P. Jan van Leeuwen, and Evensen, G., 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev.,126(7), 1719-1724, .https://doi.org/10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2
    Chan Y. T., A. G. C. Hu, and J. B. Plant, 1979: A Kalman filter based tracking scheme with input estimation.IEEE Transactions on Aerospace and Electronic Systems,AES-15, 237-244, .https://doi.org/10.1109/TAES.1979.308710
    Evensen G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics.J. Geophys. Res.: Oceans,99(C5), 10 143-10 162, https://doi.org/10.1029/94JC00572.
    Evensen G., 2009: Data Assimilation: The Ensemble Kalman Filter. 2nd ed. Springer Science & Business Media,307 pp.10.1007/978-3-540-38301-7098ae6b2b079b661d95b10d810c9121bhttp%3A%2F%2Fdl.acm.org%2Fcitation.cfm%3Fid%3D1206873http://dl.acm.org/citation.cfm?id=1206873This book reviews popular data-assimilation methods, such as weak and strong constraint variational methods, ensemble filters and smoothers. The author shows how different methods can be derived from a common theoretical basis, as well as how they differ or are related to each other, and which properties characterize them, using several examples. Readers will appreciate the included introductory material and detailed derivations in the text, and a supplemental web site.
    Gordon N. J., D. J. Salmond, and A. F. M. Smith, 1993: Novel approach to nonlinear/non-Gaussian Bayesian state estimation.IEEE Proceedings F: Radar and Signal Processing,140(3), 107-113, . 0015.https://doi.org/10.1049/ip-f-2.1993
    Houtekamer P. L., H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev.,129(2), 123-137, .https://doi.org/10.1175/1520-0493(2001)129<0123:ASEKFF>2.0.CO;2
    Kalman R. E., 1960: A new approach to linear filtering and prediction problems.Journal of Basic Engineering,82(2), 35-45, .https://doi.org/10.1115/1.3662552
    Li H., E. Kalnay, and T. Miyoshi, 2009: Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter.Quart. J. Roy. Meteor. Soc., 135(639) 523-533, .https://doi.org/10.1002/qj.371
    Linderoth M., K. Soltesz, A. Robertsson, and R. Johansson, 2011: Initialization of the Kalman filter without assumptions on the initial state.Proc. 2011 IEEE International Conference on Robotics and Automation,Shanghai, China, IEEE, 4992-4997, .https://doi.org/10.1109/ICRA.2011.5979684
    Lorenc A. C., 2003: The potential of the ensemble Kalman filter for NWP comparison with 4D-Var. Quart. J. Roy. Meteor. Soc., 129(595) 3183-3203, .https://doi.org/10.1256/qj.02.132
    Lorenz E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci.,20(3), 130-141, .https://doi.org/10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2
    Lorenz E. N., 1996: Predictability problem partly solved. Proc. Seminar on Predictability, ECMWF, Shinfield Park, Reading, Berkshire, UK., ECMWF.
    MacGregor J. F., T. Kourti, 1995: Statistical process control of multivariate processes.Control Engineering Practice,3(4), 403-414, https://doi.org/10.1016/0967-0661(95)00014-L.
    Montgomery D. C., 2007: Introduction to Statistical Quality Control. John Wiley & Sons,734 pp10.2307/2289508a3bcba88f355d38a060304f4af466f9chttp%3A%2F%2Fwww.tandfonline.com%2Fdoi%2Fpdf%2F10.1198%2Ftech.2001.s61http://www.tandfonline.com/doi/pdf/10.1198/tech.2001.s61PAGES (INTRO/BODY): SUBJECT(S): Quality control; Process control; Statistical methods.DISCIPLINE: No discipline assigned. SPONSOR(S): ABSTRACT: "ASQC Quality Press"--Spine.Includes bibliographical references and index. STATISTICS. Click on # to view. Citations, 0.
    Palmer T. N., 1993: Extended-range atmospheric prediction and the Lorenz model. Bull. Amer. Meteor. Soc.,74(2), 49-65, .https://doi.org/10.1175/1520-0477(1993)074<0049:ERAPAT>2.0.CO;2
    Salmond D., N. Gordon, 2005: An introduction to particle filters. State Space and Unobserved Component Models Theory and Applications,1-19
    Sheng Y., Y. Li, 2015: An ensemble forecast method based on observational errors.Journal of Beijing Normal University (Natural Science),2015(2), 9-13, . (in Chinese)https://doi.org/10.16360/j.cnki.jbnuns.2015.01.003
    van Loon, M., P. J. Builtjes, A. J. Segers, 2000: Data assimilation of ozone in the atmospheric transport chemistry model LOTOS.Environmental Modelling & Software,15(7), 603-609, 8152( 00) 00048- 7.https://doi.org/10.1016/S1364-
    Wang X. G., C. H. Bishop, 2003: A comparison of breeding and ensemble transform Kalman filter ensemble forecast schemes. J. Atmos. Sci.,60(9), 1140-1158, .https://doi.org/10.1175/1520-0469(2003)060<1140:ACOBAE>2.0.CO;2
    Wang Y., M. Bellus, J. F. Geleyn, X. L. Ma, W. H. Tian, and F. Weidle, 2014: A new method for generating initial condition perturbations in a regional ensemble prediction system: Blending. Mon. Wea. Rev., 142(6) 2043-2059, .https://doi.org/10.1175/MWR-D-12-00354.1
    Welch G., G. Bishop, 1995: An introduction to the Kalman filter. Tech. Rep., Department of Computer Science, University of North Carolina at Chapel Hill Chapel Hill, NC.,USA.
    Wu G. C., X. G. Zheng, L. Q. Wang, S. P. Zhang, X. Liang, and Y. Li, 2013: A new structure for error covariance matrices and their adaptive estimation in EnKF assimilation. Quart. J. Roy. Meteor. Soc., 139(672) 795-804, .https://doi.org/10.1002/qj.2000
    Zheng, H., Coauthors, 2015: A global carbon assimilation system based on a dual optimization method.Biogeosciences,12(5), 1131-1150, .https://doi.org/10.5194/bg-12-1131-2015
  • [1] ZHENG Xiaogu, WU Guocan, ZHANG Shupeng, LIANG Xiao, DAI Yongjiu, LI Yong, , 2013: Using Analysis State to Construct a Forecast Error Covariance Matrix in Ensemble Kalman Filter Assimilation, ADVANCES IN ATMOSPHERIC SCIENCES, 30, 1303-1312.  doi: 10.1007/s00376-012-2133-5
    [2] Fuqing ZHANG, Meng ZHANG, James A. HANSEN, 2009: Coupling Ensemble Kalman Filter with Four-dimensional Variational Data Assimilation, ADVANCES IN ATMOSPHERIC SCIENCES, 26, 1-8.  doi: 10.1007/s00376-009-0001-8
    [3] Zhaoxia PU, Joshua HACKER, 2009: Ensemble-based Kalman Filters in Strongly Nonlinear Dynamics, ADVANCES IN ATMOSPHERIC SCIENCES, 26, 373-380.  doi: 10.1007/s00376-009-0373-9
    [4] LIU Zhengyu, WU Shu, ZHANG Shaoqing, LIU Yun, RONG Xinyao, , 2013: Ensemble Data Assimilation in a Simple Coupled Climate Model: The Role of Ocean-Atmosphere Interaction, ADVANCES IN ATMOSPHERIC SCIENCES, 30, 1235-1248.  doi: 10.1007/s00376-013-2268-z
    [5] Xinrong WU, Shaoqing ZHANG, Zhengyu LIU, 2016: Implementation of a One-Dimensional Enthalpy Sea-Ice Model in a Simple Pycnocline Prediction Model for Sea-Ice Data Assimilation Studies, ADVANCES IN ATMOSPHERIC SCIENCES, 33, 193-207.  doi: 10.1007/s00376-015-5099-2
    [6] Jian YUE, Zhiyong MENG, Cheng-Ku YU, Lin-Wen CHENG, 2017: Impact of Coastal Radar Observability on the Forecast of the Track and Rainfall of Typhoon Morakot (2009) Using WRF-based Ensemble Kalman Filter Data Assimilation, ADVANCES IN ATMOSPHERIC SCIENCES, 34, 66-78.  doi: 10.1007/s00376-016-6028-8
    [7] Youmin TANG, Jaison AMBANDAN, Dake CHEN, , , 2014: Nonlinear Measurement Function in the Ensemble Kalman Filter, ADVANCES IN ATMOSPHERIC SCIENCES, 31, 551-558.  doi: 10.1007/s00376-013-3117-9
    [8] DING Ruiqiang, LI Jianping, 2012: Relationships between the Limit of Predictability and Initial Error in the Uncoupled and Coupled Lorenz Models, ADVANCES IN ATMOSPHERIC SCIENCES, 29, 1078-1088.  doi: 10.1007/s00376-012-1207-8
    [9] Xiaogu ZHENG, 2009: An Adaptive Estimation of Forecast Error Covariance Parameters for Kalman Filtering Data Assimilation, ADVANCES IN ATMOSPHERIC SCIENCES, 26, 154-160.  doi: 10.1007/s00376-009-0154-5
    [10] Bohua Huang, James L. Kinter III, Paul S. Schopf, 2002: Ocean Data Assimilation Using Intermittent Analyses and Continuous Model Error Correction, ADVANCES IN ATMOSPHERIC SCIENCES, 19, 965-992.  doi: 10.1007/s00376-002-0059-z
    [11] Lili LEI, Yangjinxi GE, Zhe-Min TAN, Yi ZHANG, Kekuan CHU, Xin QIU, Qifeng QIAN, 2022: Evaluation of a Regional Ensemble Data Assimilation System for Typhoon Prediction, ADVANCES IN ATMOSPHERIC SCIENCES, 39, 1816-1832.  doi: 10.1007/s00376-022-1444-4
    [12] FU Weiwei, ZHOU Guangqing, WANG Huijun, 2004: Ocean Data Assimilation with Background Error Covariance Derived from OGCM Outputs, ADVANCES IN ATMOSPHERIC SCIENCES, 21, 181-192.  doi: 10.1007/BF02915704
    [13] LIU Juanjuan, WANG Bin, WANG Shudong, 2010: The Structure of Background-error Covariance in a Four-dimensional Variational Data Assimilation System: Single-point Experiment, ADVANCES IN ATMOSPHERIC SCIENCES, 27, 1303-1310.  doi: 10.1007/s00376-010-9067-6
    [14] ZHANG Shuwen, LI Haorui, ZHANG Weidong, QIU Chongjian, LI Xin, 2005: Estimating the Soil Moisture Profile by Assimilating Near-Surface Observations with the Ensemble Kalman Filter (EnKF), ADVANCES IN ATMOSPHERIC SCIENCES, 22, 936-945.  doi: 10.1007/BF02918692
    [15] Chaoqun MA, Tijian WANG, Zengliang ZANG, Zhijin LI, 2018: Comparisons of Three-Dimensional Variational Data Assimilation and Model Output Statistics in Improving Atmospheric Chemistry Forecasts, ADVANCES IN ATMOSPHERIC SCIENCES, 35, 813-825.  doi: 10.1007/s00376-017-7179-y
    [16] BAI Yulong, LI Xin, and HUANG Chunlin, 2013: Handling error propagation in sequential data assimilation using an evolutionary strategy, ADVANCES IN ATMOSPHERIC SCIENCES, 30, 1096-1105.  doi: 10.1007/s00376-012-2115-7
    [17] Banglin ZHANG, Vijay TALLAPRAGADA, Fuzhong WENG, Jason SIPPEL, Zaizhong MA, 2016: Estimation and Correction of Model Bias in the NASA/GMAO GEOS5 Data Assimilation System: Sequential Implementation, ADVANCES IN ATMOSPHERIC SCIENCES, 33, 659-672.  doi: 10.1007/ s00376-015-5155-y
    [18] Qizhen SUN, Timo VIHMA, Marius O. JONASSEN, Zhanhai ZHANG, 2020: Impact of Assimilation of Radiosonde and UAV Observations from the Southern Ocean in the Polar WRF Model, ADVANCES IN ATMOSPHERIC SCIENCES, 37, 441-454.  doi: 10.1007/s00376-020-9213-8
    [19] GAO Feng*, Peter P. CHILDS, Xiang-Yu HUANG, Neil A. JACOBS, and Jinzhong MIN, 2014: A Relocation-based Initialization Scheme to Improve Track-forecasting of Tropical Cyclones, ADVANCES IN ATMOSPHERIC SCIENCES, 31, 27-36.  doi: 10.1007/s00376-013-2254-5
    [20] Zhiyong MENG, Eugene E. CLOTHIAUX, 2022: Contributions of Fuqing ZHANG to Predictability, Data Assimilation, and Dynamics of High Impact Weather: A Tribute, ADVANCES IN ATMOSPHERIC SCIENCES, 39, 676-683.  doi: 10.1007/s00376-021-1362-x

Get Citation+

Export:  

Share Article

Manuscript History

Manuscript received: 05 March 2017
Manuscript revised: 06 June 2017
Manuscript accepted: 11 July 2017
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Data Assimilation Method Based on the Constraints of Confidence Region

  • 1. School of Statistics, Beijing Normal University, Beijing 100875, China
  • 2. School of Mathematical Sciences, Beijing Normal University, Beijing 100875, China

Abstract: The ensemble Kalman filter (EnKF) is a distinguished data assimilation method that is widely used and studied in various fields including methodology and oceanography. However, due to the limited sample size or imprecise dynamics model, it is usually easy for the forecast error variance to be underestimated, which further leads to the phenomenon of filter divergence. Additionally, the assimilation results of the initial stage are poor if the initial condition settings differ greatly from the true initial state. To address these problems, the variance inflation procedure is usually adopted. In this paper, we propose a new method based on the constraints of a confidence region constructed by the observations, called EnCR, to estimate the inflation parameter of the forecast error variance of the EnKF method. In the new method, the state estimate is more robust to both the inaccurate forecast models and initial condition settings. The new method is compared with other adaptive data assimilation methods in the Lorenz-63 and Lorenz-96 models under various model parameter settings. The simulation results show that the new method performs better than the competing methods.

摘要: 集合Kalman滤波在数据同化方法中被广泛采用. 然而, 由于样本量个数的限制以及动态模型往往含有偏差, 集合Kalman滤波容易出现预报误差方差低估的问题, 从而进一步导致滤波发散的现象发生. 另外, 当初始状态的设置偏离真实值较远时, 初始时间点附近的估计结果也往往不是很理想. 为了解决上面的问题, 方差膨胀方法被广泛采用. 在这篇文章里, 我们提出了一种基于观测值构造相应的置信区间来估计预报方差膨胀参数的方法(EnCR). 新方法的估计结果在含有不精确的初值设置或者带有偏差的预报模型中均表现较为稳健. 最后, 我们将新方法分别用到 Lorenz-63 与 Lorenz-96 模型中, 从模拟结果我们可以看出, 新方法比以往的方法具有更精确的估计结果.

1. Introduction
  • The Kalman filter (Kalman, 1960) is a procedure that aims to obtain an optimal estimate of state based on the model evolution and observation information, and it is widely used in various fields, such as tracking, robot localization and satellite navigation (Chan et al., 1979; Welch and Bishop, 1995; Linderoth et al., 2011). However, the Kalman filter is limited to the linear Gaussian assumption and cannot be executed when the model error variance is unknown. To address these problems, (Evensen, 1994) and (Burgers et al., 1998) proposed the ensemble Kalman filter (EnKF), which uses ensemble members to represent the distribution of the state. The EnKF method is widely used in various fields including, but not limited to, oil reservoir simulation, carbon assimilation, meteorology and oceanography (van Loon et al., 2000; Houtekamer and Mitchell, 2001; Lorenc, 2003; Evensen, 2009; Zheng et al., 2015).

    However, due to the limited sample size and imprecise dynamics model evolution function, it may be easy for the forecast variance to be underestimated (Anderson and Anderson, 1999; Evensen, 2009). Therefore, over time, the observations may have a small impact on the estimation process, and further lead to the phenomenon of filter divergence.

    Therefore, dealing with the problem of forecast error variance underestimation becomes a necessary way to prevent filter divergence in this case, and the variance inflation technique is an effective procedure to address this problem.

    In early research, the variance inflation parameter was determined by repeated experiments or prior knowledge (Anderson and Anderson, 1999). (Wang and Bishop, 2003) proposed an online estimation of the inflation parameter by a sequence of innovation statistics. Based on their work, (Li et al., 2009) further developed an algorithm that can simultaneously estimate the inflation parameter and observation errors. (Wu et al., 2013) proposed a method to estimate the inflation parameter with a second-order least-squares method. Although these methods can effectively address the phenomenon of forecast variance underestimation, sometimes the distance between the actual observation and its estimate given the analysis state of the above methods may be too large. In this situation, the estimate of the state cannot be regarded as a reasonable output of the corresponding assimilation method (Anderson and Anderson, 1999). Besides, the true initial condition is usually unknown in most practical applications, and a data assimilation method is frequently initialized by guessing or prior experience, and thus the estimation process may take a long time to become stable if the initial state is settled far away from the true value. Hence, reducing this time is also very important in practical applications, particularly when the estimate is needed as soon as possible (Linderoth et al., 2011; Wang et al., 2014).

    To relieve the phenomenon of filter degeneracy and reduce the convergence time, here we judge the output of the data assimilation method in real time based on a confidence region constructed with the observation information. This idea is quite similar to the quality control method. Quality control is an online procedure to monitor and control an ongoing production process (MacGregor and Kourti, 1995; Montgomery, 2007). If the outputs of the process fall outside a pre-specified limit, the process is regarded as out of control and an adjustment measure should be taken. Based on this idea, we propose a method using the idea of quality control to address the issues of EnKF stated above, and we call this method EnCR, in which "CR" stands for "Confidence Region". In the new method, the filter process is regarded as a production process and should be adjusted when the predicted observation given the analysis value falls outside of the pre-specified limits around the actual observation. This is the basic idea of the EnCR method. In this way, the predicted observation given the analysis state will not differ too much from the actual observation, and thus the estimate (output) will be more reasonable.

    The rest of this paper is organized as follows: In section 2, we summarize the popular EnKF method and give a brief description of the phenomenon of forecast error variance underestimation. In section 3, we propose a new method to address the problem of inflation parameter estimation in the EnKF method. Various simulation results in the Lorenz-63 and Lorenz-96 models are presented in section 4. A summary of the new method is detailed in section 5. The technical proofs are given in the Appendix.

2. EnKF
  • In this section, the basic procedure of the EnKF and phenomenon of forecast error variance underestimation are presented.

    Consider a nonlinear discrete time state space model below: \begin{equation} \label{eq1} \left\{ \begin{array}{rcl} {x}_t&=&{M}_t({x}_{t-1})+{\eta}_t ,\\ {y}_t&=&{H}_t{x}_t+{\varepsilon}_t , \end{array} \right. \ \ (1)\end{equation} where t is the time index, \(x_t\in\Re^r\) is an r-dimensional state vector at time t, Mt(·) is a known nonlinear forecast operator at time t, and ηt is a white, Gaussian distribution with mean 0 and variance Qt and is often referred to as the model error. Moreover, \(y_t\in\Re^n\) is an n-dimensional observation vector at time t, Ht is a known linear observation operator at time t, and εt is a white, Gaussian distribution with mean 0 and variance Rt and is often referred to as the observation error. The error terms are assumed to be statistically independent and time-uncorrelated.

    When the forecast error operator Mt(·) is linear and Qt is known, model (2) can be solved by the classical Kalman filter. Furthermore, when Mt(·) is nonlinear and Qt is known, the extend Kalman filter, EnKF or the particle filter (PF) (Gordon et al., 1993; Salmond and Gordon, 2005) can be used to solve this model. However, when Mt(·) is nonlinear and Qt is unknown, model (2) can be solved by the EnKF or PF, in which the forecast ensemble is obtained by the evolution of dynamic model without involving the error term. Meanwhile, the corresponding ensemble can be used to estimate the forecast variance. Before proceeding with the EnKF procedure, we denote the ensemble at time t as {xt,i,i=1,2,…,m}, where m represents the sample size. In this paper, we denote the forecast ensemble with subscript f and the analysis ensemble with subscript a. Thus, the EnKF procedure can be stated as follows:

    First, the initial distribution is usually assumed to be known and generated from a Gaussian distribution: $$ {x}_{0,i,{\rm a}}\sim N(\mu_{0},Q_0),1\le i\le m , $$ where μ0 and Q0 are assumed to be known and usually specified by experience.

    The forecast ensemble and its error variance can be estimated by \begin{eqnarray} \label{eq2} {x}_{t,i,{\rm f}}&=&{M}_t({x}_{t-1,i,{\rm a}}),{x}_{t,{\rm f}}=\frac{1}{m}\sum_{i=1}^m{x}_{t,i,{\rm f}} ,\ \ (2)\\ \label{eq3} \hat{P}_{t\vert t-1}&=&\frac{1}{m-1}\sum_{i=1}^m({x}_{t,i,{\rm f}}-{x}_{t,{\rm f}})({x}_{t,i,{\rm f}}-{x}_{t,{\rm f}})' .\ \ (3) \end{eqnarray}

    Then, the analysis ensemble can be obtained by \begin{eqnarray} \label{eq4} {x}_{t,i,{\rm a}}&=&{x}_{t,i,{\rm f}}+\hat{P}_{t\vert t-1}{H}'_t({H}_t\hat{P}_{t\vert t-1}{H}'_t+R_t)^{-1}({y}_t+{\gamma}_{t,i}-{H}_t{x}_{t,i,{\rm f}}) ,\nonumber\\ &&1\le i\le m ,\ \ (4) \end{eqnarray} where γt,i∼ N(0,Rt),i=1,2,… m. Here, yt+γt,i can be viewed as a perturbed observation of the observation at time t. Therefore, the analysis state can be estimated by \begin{equation} \label{eq5} {x}_{t,{\rm a}}=\frac{1}{m}\sum_{i=1}^m{x}_{t,i,{\rm a}} . \ \ (5)\end{equation}

    From the procedure described above, we can ascertain that the estimated forecast error variance \(\hat{P}_t\vert t-1\) in Eq. (5) is underestimated because the true forecast error variance can be represented by \begin{eqnarray} \label{eq6} P_{t\vert t-1}&=&E(({x}_t-{x}_{t,i,{\rm f}})({x}_t-{x}_{t,i,{\rm f}})')\nonumber\\ &=&\mathop{\lim}\limits_{m\to\infty}(\hat{P}_{t\vert t-1}+E(({x}_t-{x}_{t,{\rm f}})({x}_t-{x}_{t,{\rm f}})'))\nonumber\\ &=&\mathop{\lim}\limits_{m\to\infty}(\hat{P}_{t\vert t-1})+Q_t . \ \ (6)\end{eqnarray}

    Hence, over time, the forecast error variance may be significantly underestimated and the phenomenon of filter degeneracy may occur.

    Moreover, even though the model error Qt is known, the problem of forecast error variance underestimation may also exist because of the existence of spurious correlations over long spatial distances or between variables known to be uncorrelated (Evensen, 2009). Hence, the filter may also become degenerated over time, as depicted in the simulation example of the perfect model case in subsection 4.2.2 or the explanations of Evensen (2009, Chapter 15) about this issue.

    Therefore, the forecast error variance is typically multiplied by a parameter that is bigger than one to relieve the problem of forecast error variance underestimation; that is, \begin{equation} P_{t\vert t-1}=\lambda_t\hat{P}_{t\vert t-1} ,\ \ (7) \end{equation} where Λt is the adjustment (inflation) parameter. Then, the inflated forecast error variance \(\lambda_t\hat{P}_t\vert t-1\) is used to replace the original \(\hat{P}_t\vert t-1\) in Eq. (5). The adjustment parameter can be estimated by many methods, e.g., the W-B (Wang and Bishop, 2003) or the SLS [Second-order Least Squares, (Wu et al., 2013)] methods. However, these methods do not consider whether the forecast observation given the analysis state differs too much from the newest observation. Next, we present the estimation of Λt with the constraints of confidence region.

3. Inflation parameter estimation by the EnCR method
  • One of the main tasks of data assimilation is to obtain an online estimate of state with the newest observation information arriving. At time t, the conditional probability density distribution (PDF) of state given all the observations until time t in model (2) can be expressed as \begin{equation} \label{eq7} P({x}_t\vert Y_t)=\frac{P({y}_t\vert{x}_t)P({x}_t\vert Y_{t-1})}{P({y}_t\vert Y_{t-1})} ,\ \ (8) \end{equation} where Yt={y1,y2,…,yt}. For more details about the deduction of Eq. (8), please refer to (Gordon et al., 1993) or (Anderson and Anderson, 1999).

    From Eq. (8), we can see that if the relative probability of yt given xt is too small——that is, the prediction observation given the analysis state diverges too far from the actual observation——the final conditional PDF of xt will also be very small. This means that xt is unlikely to be the true state (Anderson and Anderson, 1999)——that is, the analysis state xt, a cannot be regarded as a reasonable estimate if the value of \(p(y_t\vert x_t,\rm a)\) is too small. Therefore, we can evaluate the outcome using this perspective. From this point of view, we construct a confidence region based on the observation to calculate the inflation parameter in the EnKF.

  • Based on the descriptions above, we introduce a confidence region to testify the feasibility of the analysis state. This idea is similar to the quality control method. Quality control is a method that aims at monitoring the extent to which products meet specifications. By monitoring the quality characteristics, the abnormality of a production procedure can be detected quickly and a measure can be taken in time to guarantee that the process is under control (MacGregor and Kourti, 1995; Montgomery, 2007). Statistical principles are widely used in quality control and, here, we briefly introduce its procedure based on the χ2 statistic.

    Suppose the quality characteristic is x, \(\hat\Sigma\) is the corresponding estimated in-control covariance, and μ is a pre-specified value. The control region can be stated as $$ D_L=\{{x}:({x}-{\mu})'\Sigma^{-1}({x}-{\mu})<\chi_\alpha^2\} , $$ where χα is the α quantile of the χ2 distribution. If x falls in the region DL, the production process is considered to be normal; otherwise, the process is regarded as out of control and some adjustment procedure needs to be taken. Usually, α is set as 0.99; that is $$ P(({x}-{\mu})'\Sigma^{-1}({x}-{\mu})\le L)=0.99 . $$

    Similarly, we can view the EnKF method with an unknown adjustment parameter as a production process, and the analysis state can be viewed as the corresponding product of the process. Hence, we can determine the parameter based on the idea of quality control and construct the corresponding control region as follows: First, the product (analysis state) xt, a can be expressed as \begin{equation} \label{eq8} {x}_{t,{\rm a}}(\lambda_t)={x}_{t,{\rm f}}+\lambda_t\hat{P}_{t\vert t-1}{H}'_t(\lambda_t{H}_t\hat{P}_{t\vert t-1}{H}'_t+R_t)^{-1} ({y}_t-{H}_t{x}_{t,{\rm f}}) , \ \ (9)\end{equation} which is controlled by parameter Λt, and xt, f represents the prior forecast of the state. The observation can be viewed as the pre-specified value, and the control statistic can be constructed by \begin{equation} \label{eq9} u_t(\lambda_t)=({y}_t-{H}_t{x}_{t,{\rm a}}(\lambda_t))'V_t^{-1}({y}_t-{H}_t{x}_{t,{\rm a}} (\lambda_t)) , \ \ (10)\end{equation} where Vt is the in-control covariance matrix that can be calculated by $$ V_t=E(({y}_t-{H}_t{x}_{t,{\rm a}}(\lambda_t))({y}_t-{H}_t{x}_{t,{\rm a}}(\lambda_t))') . $$ The parameter Λt needs to be adjusted when xt, a falls outside the control region (confidence region): \begin{equation} D=\{\lambda_t:u_t(\lambda_t)<L\} , \ \ (10)\end{equation} where L is the α quantile of the χ2(n) distribution and is usually set between 0.90 and 0.99, and n is the dimension of yt.

    Theorem 3.1: Under assumption (7), we have \begin{equation} \label{eq10} V_t=R_t(\lambda_t{H}_t\hat{P}_{t|{t-1}}{H}'_t+R_t)^{-1}R_t \ \ (12)\end{equation}

    Theorem 3.2: Under assumption (7), the analysis state variance of EnCR can be expressed as \begin{equation} \label{eq11} E(({x}_t-{x}_{t,i,{\rm a}})({x}_t-{x}_{t,i,{\rm a}})')=(\lambda_t^{-1}\hat{P}_{t|{t-1}}^{-1}+{H}'_t R_t^{-1}{H}_t)^{-1} . \ \ (13)\end{equation}

    Theorem 3.3: The control statistic can be expressed as \begin{equation} \label{eq12} u_t(\lambda_t)=({y}_t-{H}_t{x}_{t,{\rm f}})'(\lambda_t{H}\hat{P}_{t\vert t-1}{H}'+R)^{-1} ({y}_t-{H}_t{x}_{t,{\rm f}}) , \ \ (14)\end{equation}

    Based on the results of theorem 3.3, we can ascertain that Λt is a monotonically decreasing function of utt); therefore, the adjustment parameter can be constructed by Eqs. (11) and (13). That is, if the initial value \(\lambda_t\notin D\), we increase Λt until it falls in the region D. In this paper, the parameter Λt that first falls in region D is used as its estimate. We also note that the calculation of Λt in Eq. (13) only involves the dimension of the observation, which is time-saving, especially in some high-dimensional situations, if the dimension of the observation is much smaller than the state.

    Next, the algorithm procedure of the EnCR is presented.

  • Based on the descriptions stated above, the EnCR procedure can be expressed as follows:

    (1) Set the initial values x0 and Q0; and define L to be the α quantile of χ2(n). Here, we set α=0.99 in the simulation examples.

    (2) Generate the initial ensemble x0,1, a,x0,2, a,…,x0,m, a from N(u0,Q0) and set t=1.

    (3) Set the initial parameter Λt=1 and calculate the forecast state and its variance by \begin{eqnarray*} {x}_{t,i,{\rm f}}&=&{M}_t({x}_{t-1,i,{\rm a}}) ,{x}_{t,{\rm f}}=\frac{1}{m}\sum_{i=1}^m{x}_{t,i,{\rm f}} ,\\ \hat{P}_{t|{t-1}}&=&\frac{1}{m-1}\sum_{i=1}^m({x}_{t,i,{\rm f}}-{x}_{t,{\rm f}})({x}_{t,i,{\rm f}}-{x}_{t,{\rm f}})' . \end{eqnarray*}

    (4) If the inequality below holds, go to step (6); otherwise, go to step (7): $$ ({y}_t-{H}_t{x}_{t,{\rm f}})'(\lambda_t{H}\hat{P}_{t\vert t-1}{H}'+R)^{-1}({y}_t-{H}_t{x}_{t,{\rm f}})\ge L , $$

    (5) If Λt<K, we increase the value until it enters the feasible region of step (5), where K represents the upper bound, and we set it to 100 in the simulations; then go to step (5).

    (6) Generate the observation samples γ12,…,γm from N(0,Rt) and evaluate the analysis ensemble, analysis state and its variance by \begin{eqnarray*} {x}_{t,i,{\rm a}}&=&{x}_{t,{\rm f}}+\lambda_t\hat{P}_{t|{t-1}}{H}'_t(\lambda_t{H}_t\hat{P}_{t|{t-1}}{H}'_t+R_t)^{-1} ({y}_t+{\gamma}_i-{H}_t{x}_{t,i,{\rm f}}) ,\\ && 1\le i\le m ,\\ {x}_{t,{\rm a}}&=&\frac{1}{m}\sum_{i=1}^m{x}_{t,i,{\rm a}} .\\ P_{t,{\rm a}}&=&(\lambda_t^{-1}\hat{P}_{t|{t-1}}^{-1}+{H}'_t R_t^{-1}{H}_t)^{-1} . \end{eqnarray*}

    (7) If t reaches the last time point, stop the algorithm; otherwise, go to step (4).

    Then, the analysis state of all the observation time can be obtained.

4. Experiments on Lorenz models
  • In this section, we evaluate the finite sample performance of the EnCR method with the Lorenz-63 and Lorenz-96 models. Both these models have chaotic behavior and are very sensitive to the initial condition settings. In these experiments, some of the model settings are adapted from the work of (Evensen, 2009) and (Wu et al., 2013). To illustrate the performance of the EnCR method, we compare the assimilation results of EnCR with the EnKF, W-B (Wang and Bishop, 2003) and SLS methods (Wu et al., 2013).

  • The Lorenz-63 model is a set of nonlinear differential equations with three variables (Lorenz, 1963). The solution of this model has chaotic behavior and is very sensitive to the initial condition settings. Moreover, this model has been examined by various data assimilation methods for their potential applications with other strongly nonlinear and chaotic models, such as oceanic and atmospheric models (Palmer, 1993; Anderson and Anderson, 1999; Evensen, 2009; Sheng and Li, 2015).

    4.1.1 Forecast model

    First, the Lorenz-63 model can be expressed as \begin{equation} M(x)=\left\{ \begin{array}{l} \dfrac{dx_1}{dt}=ax_2-ax_1\\ \dfrac{dx_2}{dt}=(b-x_3)x_1-x_2\\ \dfrac{dx_3}{dt}=x_1x_2-cx_3 \end{array} \right., \ \ (15)\end{equation} where x=(x1,x2,x3) is the state and a,b,c is the model parameter. Here, we adopt the parameter settings of (Evensen, 2009) and set a=10, b=28, c=8/3.

    We use a forecast model with a model error added on model (15): \begin{equation} \label{eq13} {x}_t={M}({x}_{t-1})+{\eta}_t , \ \ (16)\end{equation} where {ηt} is a sequence of independent and identical random variables that obey a normal distribution of N(0,Qt).

    4.1.2. Observation model

    Here, we consider an observation operator whose dimension is less than three: \begin{equation} \label{eq14} {y}_t={H}_t{x}_t+{\varepsilon}_t , \ \ (17)\end{equation} where {εt} is a sequence of independent and identical random variables that obey a normal distribution of N(0,Rt). The observation matrix H is set as $$ H_t=\left( \begin{array}{c@{\quad}c@{\quad}c} 1 & 2 & 3\\ 1 & 1 & 1 \end{array} \right) , $$ which is independent of time.

    4.1.3. Description of the setup

    In this simulation experiment, we set the time step as 0.05 and solve the Lorenz model (15) with a fourth-order Runge-Kutta integration scheme, and the observations are obtained every four time steps. The initial condition is given by x0=(1,2,3)', and the true state series are generated by Eq. (14) and denoted by x0,x0.2,x0.4,…,x30.

    The observations are generated by Eq. (17) and denoted by y0.2,y0.4,…,y30. We repeat this procedure 200 times and at each time we can obtain an estimate sequence. Hence, the root-mean-square error (RMSE) of the state in the kth dimension of state at time t can be calculated by \begin{equation} \label{eq15} {\rm RMSE}_k(t)=\sqrt{\frac{1}{n}\sum_{i=1}^n(x_{k,t}-x_{k,t,i,{\rm f}})^2} , \ \ (18)\end{equation} where xk,t is the kth dimension of state at time t; and xk,t,i, f is its ith forecast ensample estimate, where i=1,…,200. For simplicity, we denote RMSEk(t) simply as "RMSE". It is obvious that a smaller RMSE usually indicates a better performance of the corresponding assimilation method. Moreover, we also use the time-averaged RMSE value as a criterion to compare the performance of different assimilation methods.

    4.1.4. Initial condition settings

    Since in practical situations the true initial distribution is usually not available, we set an inaccurate initial condition for all the assimilation methods in the simulation experiments to study the robustness of the new method. That is, in the numerical experiments of subsection 4.1.5 and subsection 4.1.6, the mean value of the initial distribution used to estimate the state diverges a distance of 10 from the true initial value used to generate the state. The initial distribution is set with a normal distribution with mean (11,12,13) and variance 0.52I3. The ensemble size is 30 in all cases. Additionally, in subsection 4.1.7, the influence of different initial condition settings is further studied via numerical experiments.

    4.1.5. Assimilation results of one case

    First, we compare the performance of EnCR with the W-B and SLS methods in a situation when the model error variance Qt=0.012I3 and the observation error variance Rt=I2.

    Figure 1.  RMSE of the W-B, SLS and EnCR methods.

    Figure 1 displays the results of the RMSE of the three dimensions of the state based on the W-B, SLS and EnCR methods, separately. From this figure, we can see that the convergence rate of the EnCR method is slightly faster than that of the other two methods, and the RMSE values of the SLS and EnCR methods are clearly smaller than that of the W-B method. To further compare the performances of the EnCR and SLS methods, we remove the RMSE line of the W-B method and obtain Fig. 2. From this figure, it is clear that the EnCR method obtains the smallest RMSE overall.

    Figure 2.  RMSE of the SLS and EnCR methods.

    Moreover, Table 1 demonstrates the time-averaged RMSE values of the EnKF, W-B, SLS and EnCR methods. This table indicates that the W-B, SLS and EnCR methods significantly reduce the time-averaged RMSE of the EnKF method, and the EnCR method achieves the smallest time-averaged RMSE of all the methods. Furthermore, to study the implications of different variance settings, the four methods are compared under various variance settings in the next subsection.

    4.1.6. Implications of the model and observation error variance settings

    In this subsection, we study the estimated accuracy and stability of the EnCR method under different error variance settings.

    First, the observation error variance is set as Rt=0.12I3, and the value of model error variance is set as Qt12I3, where σ1 is a varying parameter. The results are demonstrated in Table 2. From this table, we can see that as σ1 increases, the estimated precision of all the methods decreases, which is consistent with our intuition. When Qt≤ 0.12I2, the time-averaged RMSEs of the SLS and EnCR methods are quite close; and when Qt>0.12I2, the EnCR method obtains the smallest time-averaged RMSE. Moreover, when the model error variance is large, i.e. the last several columns of Table 2, the time-averaged RMSE of the W-B, SLS and EnCR methods are much smaller than that of the EnKF method, and the EnCR method maintains the smallest time-averaged RMSE value, which suggests that the estimated accuracy of the EnCR method is better than that of the other methods.

    Using the information provided in Table 3, we study the influence of the observation error variance when the model error variance is fixed. That is, we set the model error variance Qt=0.12I3 and the observation error variance Rt22I3 with different σ2 values. From this table, we can see that, overall, the estimated precision of the EnCR method is better than that of the other methods. Moreover, it is also notable that, even when the observation variance is relatively large, the new method performs better than the other three methods. Overall, Table 3 shows that the performance of the new method is significantly superior to the other three methods.

    Based on the above results, the estimation results of the EnCR method are superior to the other three methods, particularly when the model error variance is large. Furthermore, as the error variance increases, the estimation precision of all the methods decreases. In short, the simulation results indicate that the new method has a higher estimation precision and is more stable than the other methods.

    4.1.7. Implications of the initial condition settings

    In this section, the implications of the initial condition settings are examined for different assimilation methods. Here, we set the model error variance Qt to 0.012I3 and the observation error variance to Rt=I2. For description simplicity, the variance of the initial condition is set to 0.5I3, and we set the same deviation for the simulation initial value of the three state components to the true initial value and denote it by Diff——that is, \(\rm Diff=\hat{x}_0,i-x_0,i,i=1,2,3\), where \(\hat{x}_0,i\) and x0,i are the simulation initial state and the true state of x0 in the ith component, respectively.

    Table 4 gives the estimation results of all the methods under different deviation Diff settings. The time-averaged RMSE of the EnKF method increases significantly as the deviation increases. However, the time-averaged RMSE of the W-B, SLS and EnCR methods grows very slowly as the bias of the model initial condition increases, which indicates that the three methods are not extremely sensitive to the initial condition settings. The estimated precision of the EnCR method is evidently superior to the other methods in this experiment.

    It is also interesting to study the estimated inflation values of the three methods with different initial condition settings. As we know, the initial state is biased and the variance inflation technique should be adopted in the first several steps. Here, we consider the mean value of the inflation parameter estimation in the first six steps. Table 5 gives the mean value of the inflation parameter with an increase in the bias of the initial state settings. From this table, we can see that the estimation of the inflation parameter of the new method is more reasonable than the other two methods. That is, when the deviation of the initial state is zero, a smaller inflation should be adopted, and with an increase in the deviation, the inflation parameter should also increase. The inflation estimation of the W-B method is always very large, and the variation of the SLS method is not as reasonable as the new method.

  • The Lorenz-96 model is a 40-dimension differential equations model, and it is widely used in various data assimilation studies. In this section, we use this model to study the performance of the EnCR method.

    4.2.1. Model description and parameter settings

    The Lorenz-96 model can be expressed by \begin{equation} \dfrac{dX_k}{dt}=(X_{k+1}-X_{k+2})X_{k-1}-X_k+F , \ \ (19)\end{equation} where k=1,2,…,K(K=40) and the boundary conditions are assumed to be cyclic; that is, X-1=XK-1, X0=XK, XK+1=X1. Here, F is the external forcing term and we set it to F=8 to generate the true state values in this experiment. The solution of the Lorenz-96 model has chaotic behavior and mimics the temporal evolution of a scalar meteorological quantity on a circle of latitude, and the three terms of the right-hand side of Eq. (19) can be viewed as an advection-like term, a damping term and an external forcing term, respectively (Wu et al., 2013). Besides, similar to the Lorenz63 model, we use the fourth-order Runge-Kutta time integration scheme to solve the state model (19), and set a time step of 0.05 non-dimensional units to drive the true state. Besides, assuming the characteristic time scale of the dissipation in the atmosphere is five days, the time step here is roughly equivalent to six hours in real time (Lorenz, 1996).

    Figure 3.  A-RMSE value of the EnKF, W-B, SLS and EnCR methods when the sample size is 20.

    Figure 4.  A-RMSE value of the EnKF, W-B, SLS and EnCR methods when the sample size is 80.

    In this model, we adopt the parameter settings of (Wu et al., 2013) and set X0,k=F, \(k\neq 20\), X0.20=1.01F, with the time step set as 0.2. To achieve stationary estimation results, we obtain the observations every four time steps over a duration of 100 000.

    In this study, the observations are generated by an identical observation operator with a Gaussian distribution error: \begin{equation} y_t=x_t+\varepsilon_t,\varepsilon_t\sim N(0,R_t), \ \ (20)\end{equation} where the observation error variance Rt is spatially correlated with \begin{equation} \label{eq16} R_t(i,j)=\sigma_0^20.5^{\min\{|i-j|,40-|i-j|\}} , \ \ (21)\end{equation} in which we set σ0=1.

    In this study, we use the average state estimate error to compare the performance of different methods, and denote it as A-RMSE: \begin{equation} \label{eq17} \mbox{A-RMSE}(t)=\sqrt{\frac{1}{K}\sum_{k=1}^K(\hat{x}_{k,t,{\rm f}}-x_{k,t})^2} , \ \ (22)\end{equation} where K is the dimension of state, xk,t is the true state of the kth dimension at time t, and \(\hat{x}_k,t,\rm f\) is the corresponding forecast ensemble estimate. Similar to the RMSE criterion, a smaller A-RMSE usually indicates a better performance of the corresponding assimilation method.

    4.2.2. Results when F is correctly specified

    First, we consider a situation when the forcing term F is correctly specified——that is, there has no model error when estimating the state. Here, we use a normal distribution to generate the initial ensemble: \begin{equation} x_{0,j}=x_0+\gamma_j,\gamma_j\sim N(0.05^2I_{40}),1\leq j\leq 30. \ \ (23)\end{equation}

    Figure 3 shows the results of the A-RMSE value of the four methods when the sample size is 20. To make the results clearer, only the A-RMSE results of the first 100 s (500 steps) are displayed in the figure (the trend of A-RMSE after 100 s is similar to the trend around the 100 s). When the sample size is small, the EnCR method obtains the best estimation results, followed by the W-B method. However, the results of the SLS and EnKF methods to a certain degree show up the phenomenon of filter degeneracy in this case.

    Figure 4 presents the results of the A-RMSE value when the sample size is 80. In this case, the EnKF method relieves the phenomenon of filter degeneracy in the initial stage. However, over time, the EnKF method begins to degenerate at approximately 15 s. Overall, the other three methods prevent the filter degeneration phenomenon quite well, and the estimated accuracy of the SLS and EnCR methods is slightly better than that of the W-B method.

    Additionally, Table 6 provides the results of the time-averaged A-RMSE over 100 000 time steps for all the methods under different sample size settings. Overall, the A-RMSE of the EnCR method is smaller and more stable than that of the other methods. Moreover, the estimation results of the EnKF method are the poorest, which is coincident with our intuition.

    4.2.3. Results when F is incorrectly specified

    Since the true model is usually not available, a model used in practical situations is always an approximation of the real one. Therefore, studying the estimation performance of a data assimilation method in the situation when the model is incorrectly specified is meaningful. Here, we also use the A-RMSE criterion to examine the estimation robustness of different models.

    We also adopt the model parameter settings in subsection 4.2.2 and set F=8 to generate the true states; however, we use different values of F when we estimate the states, such as F=4,5,…,12.

    Figure 5 shows the results of the time-averaged A-RMSE over 100 000 time steps of the four methods with different forcing term values when the sample size is 20. When the sample size is small, the time-averaged A-RMSE of the EnCR method is smallest among all methods. Moreover, the time-averaged A-RMSE values of the SLS and EnKF methods are significantly larger than those of the other two methods. It is also very interesting that the minimum value of the time-averaged A-RMSE of the SLS and EnKF methods is not achieved at the point when F=8, in which the value of F is same with that of the model when we generate the data. This is a little inconsistent with our intuition and is possibly because when the sample size is very small, both methods are degenerated (as shown in Fig. 3), and thus the observation information contributes very little to the state estimation process. Next, we verify this conjecture by increasing the sample size in the following simulations.

    Figure 5.  Time-averaged A-RMSE over 100 000 time steps of the four methods for different settings of the forcing term F when the sample size is 20.

    Figures 6-7 show the results when the sample size is 80 and 100, respectively. First, we can see that when the sample size is large, the EnKF and SLS methods achieve their minimum time-averaged A-RMSE when F=8, which verifies the above conjecture. When F is around the true value, there has very little difference in estimation among the SLS, W-B and EnCR methods. However, when F is much smaller than 8, the W-B method achieves the smallest time-averaged A-RMSE, but the results of the EnCR and W-B methods are almost indistinguishable. When F is much larger than 8, the EnCR method achieves the best precision among all the methods. Based on the above results, we can conclude that the EnCR method demonstrates preferable performance and robust estimation results compared with the other methods.

    Figure 6.  Time-averaged A-RMSE over 100 000 time steps of the four methods for different settings of the forcing term F when the sample size is 80.

    Figure 7.  Time-averaged A-RMSE over 100 000 time steps of the four methods for different settings of the forcing term F when the sample size is 100.

  • From the experiments using the Lorenz-63 model, we can draw the following conclusions:

    (1) Overall, the EnCR method performs best. The SLS method ranks second and, when the observation error variance is small, the precision values of the SLS and EnCR methods are very close. However, when the observation error variance is large, the performance of the EnCR method is clearly better than the other methods.

    (2) With the model error variance increases, the precision of the EnCR and SLS methods tends to be similar and better than that of the W-B and EnKF methods.

    (3) The EnCR, W-B and SLS methods are not very sensitive to the initial condition settings, and the estimation results of the EnCR method are better than those of the other two methods under different initial deviation settings, which suggests the new method produces a more robust and highly precise estimation result.

    From the experiments using the Lorenz-96 model, we can draw the following conclusions:

    (1) When the model forcing parameter is correctly specified and the sample size is small, the EnCR method is better than the other methods. With the sample size increases, the precision values of the W-B, SLS and EnCR methods become very close to one another.

    (2) When the model forcing parameter is incorrectly specified, the EnCR method is better than the other methods when the sample size is small. When F used in the model is smaller than the true value and the sample size is large, the W-B method obtains the best results, and the results of the EnCR method are very close to the results of the W-B method in that case. Moreover, when F is larger than the true value, the EnCR method achieves the best results and is clearly better than the other methods.

5. Conclusions
  • In this paper, we propose a new method to estimate the forecast error variance inflation parameter in the classical EnKF method. In the simulation experiments, the new method significantly improves the accuracy of the EnKF method and achieves robust estimation results when the initial settings and model parameters are incorrectly specified. Moreover, based on our simulation process, when the forecast model or the initial settings are very poor, the adjustment parameter estimated by the new method can be very large, and in this situation the observation information is fully used to obtain a robust estimate. Therefore, the parameter estimate of the new method is more flexible than the other methods. However, although the new method performs quite well in the simulations, when dealing with high-dimensional problems the inverse of \(\lambda_tH\hat{P}_t|t-1H'+R_t\) in the algorithm will be expensive if the dimension of the observation is also very high. In this case, our conjecture that some diagonal substitution of \(\lambda_tH\hat{P}_t|t-1H'+R_t\) could possibly be adopted to save on computation time if \(\lambda_tH\hat{R}_t|t-1H'+R_t\) is diagonal dominant. We intend to consider the situation of a high-dimensional case in future research.

    Using the idea of a confidence region to estimate the parameter of the EnKF method can also be extended to other unknown parameter estimation problems within data assimilation methods. In our future studies, the idea of a confidence region will be further studied in a similar direction.

Reference

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return