Advanced Search
Article Contents

Incorporation of Parameter Uncertainty into Spatial Interpolation Using Bayesian Trans-Gaussian Kriging


doi: 10.1007/s00376-014-4040-4

  • Quantitative precipitation estimation (QPE) plays an important role in meteorological and hydrological applications. Ground-based telemetered rain gauges are widely used to collect precipitation measurements. Spatial interpolation methods are commonly employed to estimate precipitation fields covering non-observed locations. Kriging is a simple and popular geostatistical interpolation method, but it has two known problems: uncertainty underestimation and violation of assumptions. This paper tackles these problems and seeks an optimal spatial interpolation for QPE in order to enhance spatial interpolation through appropriately assessing prediction uncertainty and fulfilling the required assumptions. To this end, several methods are tested: transformation, detrending, multiple spatial correlation functions, and Bayesian kriging. In particular, we focus on a short-term and time-specific rather than a long-term and event-specific analysis. This paper analyzes a stratiform rain event with an embedded convection linked to the passing monsoon front on the 23 August 2012. Data from a total of 100 automatic weather stations are used, and the rainfall intensities are calculated from the difference of 15 minute accumulated rainfall observed every 1 minute. The one-hour average rainfall intensity is then calculated to minimize the measurement random error. Cross-validation is carried out for evaluating the interpolation methods at regional and local levels. As a result, transformation is found to play an important role in improving spatial interpolation and uncertainty assessment, and Bayesian methods generally outperform traditional ones in terms of the criteria.
  • 加载中
  • Basistha A., D. S. Arya, N. K. Goel, 2008: Spatial distribution of rainfall in Indian Himalayas——A case study of Uttarakhand Region. Water Resour. Manag., 22, 1325- 1346.
    Box G. E. P., D. R. Cox, 1964: An analysis of transformations. Journal of the Royal Statistical Society (B), 26, 211- 252.
    Buytaert W., R. Celleri, P. Willems, B. de Bi\`evre, G. Wyseure, 2006: Spatial and temporal rainfall variability in mountainous areas: A case study from the south Ecuadorian Andes. J. Hydrol., 329, 413- 421.
    Carlson R. E., T. A. Foley, 1991: The parameter R2 in multiquadric interpolation. Computers & Mathematics with Applications, 21, 29- 42.
    Chil\`es, J. P., P. Delfiner, 1999: Geostatistics: Modeling Spatial Uncertainty. John Wiley and Sons, New York, 731 pp.
    Cox D. R., D. V. Hinkley, 1979: Theoretical Statistics. Chapman and Hall, London, 528 pp.
    Cressie N., 1993: Statistics for Spatial Data. Wiley-Interscience, 928 pp.
    Cressie N., H.-C. Huang, 1999: Classes of nonseparable, spatio-temporal stationary covariance function. Journal of the American Statistical Association, 94, 1330- 1340.
    Diggle P. J., J. A. Tawn, R. A. Moyeed, 1998: Model-based geostatistics (with discussion). Applied Statistics, 47, 299- 350.
    Dirks K. N., J. E. Hay, C. D. Stow, D. Harris, 1998: High-resolution of rainfall on Norfolk Island, Part II: Interpolation of rainfall data. J. Hydrol., 208, 187- 193.
    Erdin R., C. Frei, 2012: Data transformation and uncertainty in geostatistical combination of radar and rain gauges. J. Hydrometeor., 13, 1332- 1346.
    Franke R., 1982: Scattered data interpolation: Test of some methods. Mathematics of Computations, 33, 181- 200.
    Genton M. G., 2007: Separable approximations of space-time covariance matrices. Environmetrics, 18, 681- 695.
    Guttorp P., W. Meiring, P. D. Sampson, 1994: A space-time analysis of ground-level ozone data. Environmetrics, 5, 241- 254.
    Hcock, M. S., M. L. Stein, 1993: A Bayesian analysis of kriging. Technometrics, 35, 403- 410.
    Isaaks E. H., R. M. Srivastava, 1989: Introduction to Applied Geostatistics. Oxford University Press, Oxford, 561 pp.
    Ly S., C. Charles, A. Degr\'e, 2011: Geostatistical interpolation of daily rainfall at catchment scale: The use of several variogram models in the Ourthe and Ambleve catchments, Belgium. Hydrology and Earth System Sciences, 15, 2259- 2274.
    Ma C. S., 2008: Recent developments on the construction of spatio-temporal covariance models. Stochastic Environmental Research and Risk Assessment, 22, S39- S47.
    Nalder I. A., R. W. Wein, 1998: Spatial interpolation of climatic Normals: Test of a new method in the Canadian boreal forest. Agricultural and Forest Meteorology, 92, 211- 225.
    Schabenberger O., C. A. Gotway, 2004: Statistical Methods for Spatial Data Analysis. Chapman and Hall, 512 pp.
    Schuurmans J. M., M. F. P. Bierkens, E. J. Pebesma, R. Uijlenhoet, 2007: Automatic prediction of high-resolution daily rainfall fields for multiple extents: The potential of operational radar. J. Hydrometeor., 8, 1204- 1224.
    Verworn A., U. Haberlandt, 2011: Spatial interpolation of hourly rainfall-effect of additional information, variogram inference and storm properties. Hydrology and Earth System Sciences, 15, 569- 584.
    Wang F. J., M. M. Wall, 2003: Incorporating parameter uncertainty into prediction intervals for spatial data modeled via a parametric variogram. Journal of Agricultural, Biological, and Environmental Statistics, 8, 296- 309.
    Xie H., X. Zhang, B. Yu, H. Sharif, 2011: Performance evaluation of interpolation methods for incorporating rain gauge measurements into NEXRAD precipitation data: A case study in the upper Guadalupe river basin. Hydrological Processes, 25, 3711- 3720.
    Yao T., 1998: Automatic modeling of (cross) covariance tables using fast Fourier transform. Mathematical Geology, 30, 589- 615.
    Yilmaz H. M., 2007: The effect of interpolation methods in surface definition: An experimental study. Earth Surface Process and Landforms, 32, 1346- 1361.
  • [1] LI Tao, ZHENG Xiaogu, DAI Yongjiu, YANG Chi, CHEN Zhuoqi, ZHANG Shupeng, WU Guocan, WANG Zhonglei, HUANG Chengcheng, SHEN Yan, LIAO Rongwei, 2014: Mapping Near-surface Air Temperature, Pressure, Relative Humidity and Wind Speed over Mainland China with High Spatiotemporal Resolution, ADVANCES IN ATMOSPHERIC SCIENCES, 31, 1127-1135.  doi: 10.1007/s00376-014-3190-8
    [2] JIE Weihua, WU Tongwen, WANG Jun, LI Weijing, LIU Xiangwen, 2014: Improvement of 6-15 Day Precipitation Forecasts Using a Time-Lagged Ensemble Method, ADVANCES IN ATMOSPHERIC SCIENCES, 31, 293-304.  doi: 10.1007/s00376-013-3037-8
    [3] LIU Ge, WU Renguang, ZHANG Yuanzhi, and NAN Sulan, 2014: The Summer Snow Cover Anomaly over the Tibetan Plateau and Its Association with Simultaneous Precipitation over the Mei-yu-Baiu region, ADVANCES IN ATMOSPHERIC SCIENCES, 31, 755-764.  doi: 10.1007/s00376-013-3183-z
    [4] Chang-Kyun PARK, Minhee CHANG, Chang-Hoi HO, Kyung-Ja HA, Jinwon KIM, Byung-Ju SOHN, 2021: Two Types of Diurnal Variations in Heavy Rainfall during July over Korea, ADVANCES IN ATMOSPHERIC SCIENCES, 38, 2201-2211.  doi: 10.1007/s00376-021-1178-8
    [5] REN Guoyu, DING Yihui, ZHAO Zongci, ZHENG Jingyun, WU Tongwen, TANG Guoli, XU Ying, 2012: Recent Progress in Studies of Climate Change in China, ADVANCES IN ATMOSPHERIC SCIENCES, 29, 958-977.  doi: 10.1007/s00376-012-1200-2
    [6] Athanassios A. ARGIRIOU, Zhen LI, Vasileios ARMAOS, Anna MAMARA, Yingling SHI, Zhongwei YAN, 2023: Homogenised Monthly and Daily Temperature and Precipitation Time Series in China and Greece since 1960, ADVANCES IN ATMOSPHERIC SCIENCES, 40, 1326-1336.  doi: 10.1007/s00376-022-2246-4
    [7] Meng YAN, Johnny C. L. CHAN, Kun ZHAO, 2020: Impacts of Urbanization on the Precipitation Characteristics in Guangdong Province, China, ADVANCES IN ATMOSPHERIC SCIENCES, 37, 696-706.  doi: 10.1007/s00376-020-9218-3
    [8] Tian FENG, Fumin REN, Da-Lin ZHANG, Guoping LI, Wenyu QIU, Hui YANG, 2020: Sideswiping Tropical Cyclones and Their Associated Precipitation over China, ADVANCES IN ATMOSPHERIC SCIENCES, 37, 707-717.  doi: 10.1007/s00376-020-9224-5
    [9] WANG Shaowu, ZHU Jinhong, CAI Jingning, 2004: Interdecadal Variability of Temperature and Precipitation in China since 1880, ADVANCES IN ATMOSPHERIC SCIENCES, 21, 307-313.  doi: 10.1007/BF02915560
    [10] ZHANG Xinping, JIN Huijun, SUN Weizhen, 2006: Stable Isotopic Variations in Precipitation in Southwest China, ADVANCES IN ATMOSPHERIC SCIENCES, 23, 649-658.  doi: 10.1007/s00376-006-0649-2
    [11] GE Quansheng, WANG Shaowu, WEN Xinyu, Caiming SHEN, HAO Zhixin, 2007: Temperature and Precipitation Changes in China During the HoloceneTemperature and Precipitation Changes in China During the Holocene, ADVANCES IN ATMOSPHERIC SCIENCES, 24, 1024-1036.  doi: 10.1007/s00376-007-1024-7
    [12] Xiaoling YANG, Botao ZHOU, Ying XU, Zhenyu HAN, 2021: CMIP6 Evaluation and Projection of Temperature and Precipitation over China, ADVANCES IN ATMOSPHERIC SCIENCES, 38, 817-830.  doi: 10.1007/s00376-021-0351-4
    [13] Xingmin LI, Yan DONG, Zipeng DONG, Chuanli DU, Chuang CHEN, 2016: Observed Changes in Aerosol Physical and Optical Properties before and after Precipitation Events, ADVANCES IN ATMOSPHERIC SCIENCES, 33, 931-944.  doi: 10.1007/s00376-016-5178-z
    [14] ZHANG Xinping, LIU Jingmiao, TIAN Lide, HE Yuanqing, YAO Tandong, 2004: Variations of 18O in Precipitation along Vapor Transport Paths, ADVANCES IN ATMOSPHERIC SCIENCES, 21, 562-572.  doi: 10.1007/BF02915724
    [15] YIN Shuiqing, LI Weijing, Deliang CHEN, Jee-Hoon JEONG, GUO Wenli, 2011: Diurnal Variations of Summer Precipitation in the Beijing Area and the Possible Effect of Topography and Urbanization, ADVANCES IN ATMOSPHERIC SCIENCES, 28, 725-734.  doi: 10.1007/s00376-010-9240-y
    [16] FU Danhong, GUO Xueliang, 2006: A Cloud-resolving Study on the Role of Cumulus Merger in MCS with Heavy Precipitation, ADVANCES IN ATMOSPHERIC SCIENCES, 23, 857-868.  doi: 10.1007/s00376-006-0857-9
    [17] SONG Lianchun, A. J. CANNON, P. H. WHITFIELD, 2007: Changes in Seasonal Patterns of Temperature and Precipitation in China During 1971--2000, ADVANCES IN ATMOSPHERIC SCIENCES, 24, 459-473.  doi: 10.1007/s00376-007-0459-1
    [18] Geng Quanzhen, Ding Yihui, Huang Chaoying, 1997: Influences of the Extratropical Pacific SST on the Precipitation of the North China Region, ADVANCES IN ATMOSPHERIC SCIENCES, 14, 339-349.  doi: 10.1007/s00376-997-0054-5
    [19] Min Luo, Yuzhi Liu, Jie Gao, Run Luo, Jinxia Zhang, Ziyuan Tan, Siyu Chen, Khan Alam, 2024: A New Merged Product Revealed Precipitation Features over Drylands in China, ADVANCES IN ATMOSPHERIC SCIENCES.  doi: 10.1007/s00376-024-3159-1
    [20] YUAN Fang, CHEN Wen, ZHOU Wen, 2012: Analysis of the Role Played by Circulation in the Persistent Precipitation over South China in June 2010, ADVANCES IN ATMOSPHERIC SCIENCES, 29, 769-781.  doi: 10.1007/s00376-012-2018-7

Get Citation+

Export:  

Share Article

Manuscript History

Manuscript received: 04 March 2014
Manuscript revised: 17 June 2014
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Incorporation of Parameter Uncertainty into Spatial Interpolation Using Bayesian Trans-Gaussian Kriging

  • 1. Department of Statistical Science, Baylor University, USA
  • 2. Department of Astronomy and Atmospheric Sciences, Research and Training Team for Future Creative Astrophysicists and Cosmologist, Kyungpook National University, Republic of Korea

Abstract: Quantitative precipitation estimation (QPE) plays an important role in meteorological and hydrological applications. Ground-based telemetered rain gauges are widely used to collect precipitation measurements. Spatial interpolation methods are commonly employed to estimate precipitation fields covering non-observed locations. Kriging is a simple and popular geostatistical interpolation method, but it has two known problems: uncertainty underestimation and violation of assumptions. This paper tackles these problems and seeks an optimal spatial interpolation for QPE in order to enhance spatial interpolation through appropriately assessing prediction uncertainty and fulfilling the required assumptions. To this end, several methods are tested: transformation, detrending, multiple spatial correlation functions, and Bayesian kriging. In particular, we focus on a short-term and time-specific rather than a long-term and event-specific analysis. This paper analyzes a stratiform rain event with an embedded convection linked to the passing monsoon front on the 23 August 2012. Data from a total of 100 automatic weather stations are used, and the rainfall intensities are calculated from the difference of 15 minute accumulated rainfall observed every 1 minute. The one-hour average rainfall intensity is then calculated to minimize the measurement random error. Cross-validation is carried out for evaluating the interpolation methods at regional and local levels. As a result, transformation is found to play an important role in improving spatial interpolation and uncertainty assessment, and Bayesian methods generally outperform traditional ones in terms of the criteria.

1. Introduction
  • Quantitative precipitation estimation (QPE) plays an important role in meteorological and hydrological applications. Rain gauges are widely used to collect precipitation measurements due to certain advantages. For instance, rain gauges directly measure rainfall on the ground, thus providing accurate ground-level precipitation observations with limited error.

    Precipitation varies significantly in time and space. Hence, spatial interpolation methods are commonly employed to estimate precipitation in locations lacking measurement equipment. Examples of such interpolation methods include inverse distance weighting (Franke, 1982), local polynomial (Yilmaz, 2007), and radial basis function (Carlson and Foley, 1991). Spatial interpolation methods are typically classified into two categories: deterministic and stochastic. The deterministic methods provide no assessment of possible errors, while the stochastic methods offer probabilistic estimates (Dirks et al., 1998; Nalder and Wein, 1998; Buytaert et al., 2006; Basistha et al., 2008; Ly et al., 2011). In this paper, we focus on geostatistical stochastic methods, in particular ordinary kriging (OK) and its variants are considered as a spatial interpolation tool.

    Kriging is a simple and popular geostatistical interpolation method. There are several variants, such as simple kriging, ordinary kriging, universal kriging, and indicator kriging (Cressie, 1993; Schabenberger and Gotway, 2004). This paper tackles two problems that are often neglected in kriging analysis——uncertainty underestimation and violation of assumptions——in a single framework.

    Although kriging is widely used for spatial interpolation, the spatial structure of the underlying process is presumed known, leading to a plug-in or two-stage procedure. Hence, kriging is often performed after estimating the parameters of the spatial structure. In this framework, the uncertainty in spatial interpolation is often underestimated because the procedure ignores uncertainty in the parameter estimation related to spatial structure, as if the parameters were the true values. This is an optimistic assessment of predictive accuracy. To overcome this problem, there are several ways to account for the uncertainty, including bootstrapping (Wang and Wall, 2003) and Bayesian statistics (Diggle et al., 1998; Handcock and Stein, 1993). The Bayesian approach allows explicit accounting of uncertainties in the model parameters by treating them as random quantities, rather than as unknown constants as in the classical approach. Within this framework, parameter estimation and prediction can be conducted simultaneously without the separation in the two-stage procedure above. This framework accounts for the uncertainty ignored by model parameter estimation, leading to more realistic spatial prediction, with better uncertainty assessment. In addition, any a priori knowledge about the unknown quantities can be incorporated into the inference.

    The second problem is that spatial data sets in practice often violate the assumptions required for kriging, and it is neglected in spatial analysis. Kriging may be used to find the best linear unbiased predictor (BLUP), as it fully complies with the validity of all the required assumptions, such as normality, stationarity, and homoscedasticity (Isaaks and Srivastava, 1989). Furthermore, the effect of violating the assumption on spatial prediction would be substantial. This paper will investigate suitable remedies for the violations.

    The main objective of this study is to obtain an optimal spatial interpolation for QPE to assess prediction uncertainty appropriately and fulfill the required assumptions. To this end, we compare several kriging variants: (1) transformation; (2) detrending; (3) spatial correlation function; and (4) Bayesian kriging. Very few studies have explored these issues simultaneously in a single framework. We focus on a short-term and time-specific analysis rather than long-term and event-specific ones. The data and methodology are shown in section 2 and the analysis results are discussed in section 3.

2. Data and method
  • The rain gauge data are collected by tipping bucket rain gauges in an Automatic Weather System (AWS) with 0.5 mm resolution. Thus, 0.5 mm h-1 is used as the cut-off threshold to reject dry areas experiencing no rain from the analysis. A total of 100 AWS stations are used over the area of (34.33°-37.03°N, 126.84°-128.25°E). The rainfall intensities are calculated from the difference between the AWS-observed 15 minute accumulated rainfall amounts every minute. The one hour average rainfall intensity is calculated to minimize the measurement random error. Figure 1 displays the locations of the rain gauges in the study area. The event to be analyzed is a stratiform precipitation event with embedded convection related to the passage of the monsoon front that occurred on 23 August 2012. The convective line developed over the southeast of the domain associated with the monsoon front (Fig. 2). Two lines intensified from 0600-0900 LST, and the systems became weaker and spread widely. The rain band moved southeast to northwest. The total rainfall amounts were recorded up to a maximum of 173 mm in the analysis region.

    Figure 1.  Locations of the 100 rain gauges in the study area. The different symbols represent the total precipitation over 24 hours: circles have ≤50 mm, triangles ≤100 mm, and squares ≥100 mm.

    Figure 2.  Radar images of the estimated rainfall rate every three hours from 0000 to 2100 LST 23 August 2012.

    The precipitation data are summarized in Table 1. Since only wet stations are used in the analysis, the number of stations considered varies from 33 to 76 over the time steps of the study period. There was intense precipitation during some time steps, e.g. time steps 4 (42 mm h-1), 6 (59.5 mm h-1), and 13 (84.5 mm h-1). Data variation, measured by the standard deviation (SD), was not constant over time, ranging from 1.402 mm h-1 to 10.262 mm h-1. This finding motivates hourly-specific spatial analysis rather than aggregated analysis over all time steps in a single event.

  • Some inherent characteristics of precipitation lead to violation of the normality and constant variance (homoscedasticity) conditions necessary for OK to be BLUP. For instance, precipitation data are commonly skewed, due to their intermittency; data transformations have been commonly adopted to remedy this. Although square root and logarithmic transformations have been used (e.g., Schuurmans et al., 2007; Verworn and Haberlandt, 2011), they do not always sufficiently account for the departure from the assumptions. An alternative option is the family of power transformations, including the Box-Cox transformation, which is the most widely used (Box and Cox, 1964). This is given by: $$ Z=\left\{ \begin{array}{c@{}c} \dfrac{Y^\lambda-1}{\lambda}, & \lambda\ne0\\[3mm] \log(Y), & \lambda=0, \end{array} \right. $$ where Z is the transformed data, Y is the original data larger than the threshold, and Λ is the transformation parameter. This transformation is typically applied to positive data, which exclude dry areas in this study. The transformation is data-driven because the transformation parameter is determined according to profile log-likelihoods over the value within some ranges. In this paper, we consider a time-specific rather than an event-specific transformation, because the same transformation over time in an event seems to be unreasonable due to variation of the precipitation process.

  • Kriging requires spatial pattern information. An empirical variogram is first computed and fitted to a theoretical variogram in order to estimate spatial parameters, such as sill, range, and nugget, via $$ 2\hat{\gamma}(h)=\dfrac{1}{2|N(h)|}\sum_{(s_i,s_j)\in N(h)}{[Z(s_i)-Z(s_j)]^2} , $$ where N(h) is the set of pairs of observations Z(si) and Z(sj) such that distance between two locations d(si,sj) is equal to spatial lag h and |N(h)| is the number of the pairs. Variogram-based parameter estimation is generally inefficient because it is based on the smoothed variogram, which is not the original but a summary of the data. Likelihood-based methods are a general means to make use of the data generating process, but this approach requires a distributional assumption, such as normality. In this study, we adopt a maximum likelihood estimation that is widely implemented in statistical inference. Maximum likelihood estimators require a spatial distribution to construct the likelihood function. In this case, let Z=(Z(s1),…,Z(sn)) T denote the vector of transformed observations, with a multivariate normal distribution with mean μn, and covariance matrix \(\Sigma(\theta)\), where n is a n× 1 vector of ones, and θ is the vector of spatial parameters such as partial sill and range. The resulting log-likelihood function is given by \begin{eqnarray*} l(\mu,{\theta}|{Z})&=&-\dfrac{1}{2}\ln|\Sigma({\theta})|-\dfrac{n}{2}\ln 2\pi\\ &&-\dfrac{1}{2}({Z}-\mu{n})^{ T}\Sigma({\theta})^{-1}({Z}-\mu{n}) . \end{eqnarray*} The maximum likelihood estimators can be obtained by maximizing the log-likelihood function with respect to the parameters μ and θ. The resulting estimator has meaningful statistical properties under some mild regularity conditions (Cox and Hinkley, 1974).

    Exponential and spherical functions have often been used as spatial correlation functions in variogram modeling and kriging analysis (Chilès and Delfiner, 1999). In addition to these functions, the circular correlation function is considered in this study, a first in the study of rain. Table 2 presents the forms of the functions. Surprisingly, as shown in the results below, the new correlation function is the most frequently selected as the optimal spatial correlation function over time. For every time step, three correlation models are fitted, and one of them is selected according to a model selection criterion, known as the Bayesian Information Criterion (BIC). Therefore, a different correlation model is fitted to each time step.

    The mean of the function often spatially varies over the region of interest, while one of the required assumptions for BLUP is a constant mean. In this case, detrending can deal with the non-constant mean problem. A trend surface is commonly modeled using spatial coordinates or available covariate information, and the residuals between the observations and the fitted trends delineate the spatial structure. As a result, this detrending, or removal of trends, can reduce the variability of the predictive distribution. We examine the effect of detrending on spatial prediction for spatial coordinates, longitude and latitude.

  • OK is a linear interpolation method that is unbiased and minimizes the variance of the observations. The weightings of the linear interpolator are found by solving a system of equations with some constraints in order to achieve the sound properties. Several variants of OK are considered in this study. Trans-Gaussian OK (TOK) is a variant of kriging with a transformed Gaussian random field when the transformation is known (Cressie, 1993). Application of the Box-Cox transformation is assumed to transform non-normally distributed data into a normal distribution. Bayesian OK (BOK) and Bayesian trans-Gaussian OK (BTOK) perform OK and TOK from a Bayesian perspective. The kriging variants with detrending based on spatial coordinates are also considered.

    Bayesian analysis requires estimation of prior distributions for unknown parameters (θ). Combining the likelihood function L(θ|Z) with a prior distribution P(θ) leads to an expression for the posterior distribution p(θ|Z) via normalized Bayes' Theorem: $$ p({\theta}|{Z})=\dfrac{L({\theta}|{Z})p({\theta})}{\displaystyle\int L({\theta}|{Z})p({\theta})d{\theta}} . $$

    The posterior distribution provides a probability statement about the parameters and allows for uncertainty in all

    parameters. Similarly, the Bayesian predictive distribution p(Z(s0)|Z) for an arbitrary and unobserved location s0 can be obtained as $$ p(Z(s_0)|{Z})=\int p(Z(s_0)|{Z},{\theta})p({\theta}|{Z})d{\theta} . $$ In this study, we choose non-informative priors due to a lack of a priori information about the parameters.

  • Cross-validation is carried out to evaluate the influence of data transformation, detrending, spatial autocorrelation, and different interpolation methods (i.e. classical and Bayesian) on interpolation performance. Let Z(i,k) and \(\hat Z(i,k)\) denote the observed and predicted values from the leave-one-out cross-validation at the i th monitoring station in the k th time step, respectively. The mean absolute error (MAE) is then employed to evaluate the quality of the interpolations, $$ { MAE}_k=\dfrac{1}{n_k}\sum_i^{n_k}|\hat {Z}(i,k)-Z(i,k)| , $$ where nk is the number of wet stations in the k th time step. This criterion measures the unbiasedness of the cross-validation prediction. A correlation coefficient is used to evaluate the agreement between the observed and predicted values, $$ R_k^2=\dfrac{\displaystyle\sum_i^{n_k}{(Z(i,k)-\bar{Z}(k))}(\hat{Z}(i,k)-\bar{\hat{Z}}(k))} {\displaystyle\sqrt{\sum_i^{n_k}{(Z(i,k)-\bar{Z}(k))^2}}\sqrt{\sum_i^{n_k}{(\hat{Z}(i,k)-\bar{\hat{Z}}(k))^2}}} , $$ where \(\bar Z(k)\) and \(\bar{\hat Z}(k)\) are the means of the observed and predicted values over all the wet stations on the k th time step, respectively. This measure is the square of Pearson's correlation coefficient ranging between 0 and 1. Similar to the coefficient of determination in regression analysis, the coefficient measures how good a spatial interpolator is constructed from the observed data.

    To compare the prediction uncertainty, two measures are employed, the length of the prediction interval and the coverage probability. The prediction interval Lk is formed from the predicted value and its prediction error, and a wider interval represents greater uncertainty. The former is the average length of the cross-validation prediction interval over all wet stations at a given time step, $$ L_k=\dfrac{1}{n_k}\sum_i{(U(i,k)-L(i,k))} , $$ where U(i,k) and L(i,k) are the upper and lower bounds of the prediction intervals at station i at a given time step k, respectively. The coverage probability is computed by counting how many times the observed values fall in the prediction intervals, $$ P_k=\dfrac{1}{n_k}\sum_i{I_{[L(i,k),U(i,k)]}}(Z(i,k)) , $$ where IA(x) is the indicator function, which is 1 for x A, and 0 otherwise. It is expected that the coverage probability is close to the nominal value (e.g, 95%).

    The methods considered in this study are evaluated on two spatial scales, regional and local (Xie et al., 2011). The regional-scale evaluation is performed over the entire study area for each time step, while the local-scale evaluation compares the performance of the different spatial interpolation methods at each station over all time steps.

3. Results
  • Data transformation is employed to examine the assumptions required for the optimal spatial interpolation methods. Most statistical interpolation methods assume data are normally distributed. Figure 3 describes the distributions of raw and transformed precipitation datasets over 24 time steps using boxplots. The raw precipitation data include outliers in all time steps, but not the transformed data. This highlights the advantage of data transformation because outliers easily lead to non-normal data distributions. Robust distributions are more appropriate for data with outliers, such as the t-distribution.

    Figure 3.  Boxplots for the raw and transformed data over 24 hour time steps.

    Other statistical measures and testing are further used to study the impact of the power transformation on distribution normality (Table 3). Skewness is a measure of the extent of symmetry of a distribution, and kurtosis is a descriptor of the shape of a distribution, measuring the peakedness or flatness of a distribution. Positive kurtosis indicates a peaked distribution, while negative kurtosis corresponds to a flat distribution. The normal distribution has zero skewness and kurtosis by definition. Table 3 reports that positive skewness and large kurtosis are found in raw data over all time steps, indicating asymmetric and positively skewed distributions. This is expected for rainfall data, which are typically skewed toward heavy rain. After transformation, such data have significantly reduced kurtosis. For testing normality quantitatively, the Kolmogorov-Smirnov test is performed, and the results are presented in Table 3, before and after transformation. For raw data, none of the stations' data have a normal distribution, although two stations fulfill that assumption at the 0.05 significance level, after applying the transformation. Although the remaining stations have p-values less than 0.05, the symmetry is much enhanced in terms of both skewness and kurtosis. Figure 4 presents normal quantile-quantile plots of the raw and Box-Cox transformed datasets over two time steps with transformation parameter estimates. It is clear that the transformation improves the normality. A significant deviation from normality is found in the tails of the distribution of the raw data, though this is greatly reduced in the transformed data.

    Figure 4.  Normal quantile-quantile (Q-Q) plots of raw (left) and Box-Cox transformed (right) data at two time steps (0500 LST above and 0900 LST below). The estimated transformation parameters (\(\Lambda\)) are 0.075 and 0.03, respectively.

  • To optimize fitting of the variogram, we consider two potential factors that can improve spatial prediction. The first is the spatial correlation function, which models spatial structure, and therefore leads to poor spatial prediction if specified incorrectly. This effect is more pronounced for classical estimation, which assumes that the selected function is true in the prediction stage. Surprisingly, the spatial correlation function varies significantly in time. We find that the under-used circular correlation function is most suitable (Table 4). Detrending does not significantly affect the selection, whereas transformation does.

    The significance of the trend (i.e. the variation of the mean in space) is investigated at each time step, and Table 5 summarizes the result. Over 67% of raw and 83% of transformed time steps have a significant trend effect. This motivates modeling of the trend surface in order to characterize the spatial structure for spatial interpolation. In particular, latitude is an important factor for the modeling trend in this analysis. Figure 1 illustrates the significant latitudinal variation of the total rain amount.

    Figure 5.  Boxplots of the mean absolute error (top) and correlation coefficient (bottom) values on the regional scale. The horizontal thick bar indicates the median values. The bottom and top of the box indicate the first and third quartiles. The thin lower (upper) bar is the minimum (maximum) value. The circles indicates outliers. OK, Ordinary kriging; OKD, Ordinary kriging with detrend; TOK, Trans-Gaussian kriging; TOKD, Trans-Gaussian kriging with detrend; BOK, Bayesian ordinary kriging; BOKD, Bayesian ordinary kriging with detrend; BTOK, Bayesian trans-Gaussian ordinary kriging; and BTOKD, Bayesian trans-Gaussian ordinary kriging with detrend.

    Figure 6.  Boxplots of mean absolute error (top) and correlation coefficient (bottom) values at the local scale. The symbols are the same as in Fig. 4.

    Figure 7.  95% coverage probability for the cross-validation prediction intervals. The estimated probabilities are averaged over stations for a given time step. The vertical and horizontal lines are nominal probability 0.95.

  • The comparison of different spatial interpolation methods is shown in terms of MAE in Fig. 5 as a box-plot. The BTOK approach results in the smallest median, whereas the TOKD approach has the largest. In general, the Bayesian methods outperform the traditional methods in terms of MAE. The distributions of the MAE values of BTOK and TOK are narrower than those of the other methods. There are no outlying values (circles in Fig. 5) of MAE in BTOK and BTOKD.

    The BTOK has the largest correlation coefficient and the smallest variation. Detrending does not improve the correlation. Particularly, BOKD and BTOKD have low correlation coefficient values. Overall, the Bayesian methods perform better than the classical methods, in terms of both criteria. In contrast to detrending, transformation enhances the unbiasedness of the spatial interpolation.

    Local-scale evaluation at each rain gauge is shown in Fig. 6. Similar to the regional scale, transformation-based methods without detrending have, in general, a smaller MAE on the local scale, and Bayesian methods overall outperform the classical methods. TOK and BTOK perform best in terms of the local MAE. As for correlation coefficients, TOK and BTOK have larger values than the other methods. Detrending has a negative effect on both MAE and the correlation coefficient, whereas transformation and the Bayesian methods provide some improvement.

    Two aspects of uncertainty estimation are compared. First, the average coverage probabilities for each time step are compared for the classical and Bayesian methods for each kriging variant. Second, prediction intervals are constructed with 95% nominal probability. Hence, a good prediction interval has a coverage probability close to the nominal probability.

    Overall, the Bayesian methods generally have reasonable ranges of coverage probability, while the classical methods have some extreme probabilities (Fig. 7). The coverage probabilities for the Bayesian methods (y-axis) are reasonably spread with some consistency, while those of the classical methods range too widely with extreme values.

    The average lengths of the prediction intervals based on the spatial interpolation methods are shown in Fig. 8. As addressed earlier, the width of the intervals represents uncertainty in the prediction. As expected, Bayesian methods have generally longer widths than the classical ones, and this indicates that the Bayesian approach evaluates a more realistic uncertainty in spatial prediction. There is no systematic effect of transformation and detrending on the length.

    Figure 8.  Average length of the cross-validation prediction intervals.

    The scatterplot of prediction intervals (Fig. 9) ensures that Bayesian methods have longer intervals. There are some significant length differences between BTOK and TOK. Overall, the Bayesian approaches have realistic and larger uncertainty estimation. Figure 10 presents the spatial precipitation estimated by BTOK for four time steps.

    Figure 9.  Scatterplots of the length of the prediction intervals.

    Figure 10.  Spatial precipitation estimated by BTOK at four time steps: 0500 LST (upper-left); 1100 LST (upper-right); 1700 LST (lower-left); 2300 LST (lower-right).

4. Summary and discussion
  • In this paper, we investigate two problems commonly found and often neglected in the spatial interpolation of QPE: uncertainty underestimation and violation of assumptions, along with their effects on spatial prediction and uncertainty estimation. The methods addressed are considered to resolve the problems, and implemented in a single framework. The proposed method is illustrated with a rain gauge dataset consisting of 100 AWS stations in South Korea. A stratiform precipitation event is analyzed with one hour average rainfall intensity in order to minimize measurement error. The proposed kriging variants are applied to the dataset and compared with several criteria at regional and local levels.

    Overall, the methods improve spatial interpolation. For instance, we find that transformation plays an important role in improving spatial interpolation and is not constant over each time and event. Hence, it is challenging to find an appropriate transformation for each time step and event. The same transformation is commonly applied to all time steps in the event. Time-specific transformation is recommended, even though this may demand slightly more computational resources. In summary, a desirable method for a reliable QPE must be flexible and sophisticated enough to account for dynamic precipitation processes, as opposed to a static approach that assumes the same transformation or spatial structure over time in a given event.

    As used in this study, parametric spatial functions typically model spatial structure. However, they are often too restrictive to explain the structure adequately. A flexible spatial structure function is necessary to characterize the spatial structure of precipitation, e.g. by using a nonparametric approach (Yao, 1998). As a parametric approach, the Matèrn class of covariance functions is a reasonable alternative (Handcock and Stein, 1993).

    Detrending is used as a means of correcting the non-stationarity of the underlying process. In this study, the trend is modeled using only the spatial coordinates, longitude and latitude. Although detrending appears to influence the selection of the spatial correlation function, this method barely impacts on the evaluation of spatial interpolation. However, it is possible that detrending can be a powerful tool if the trend is modeled adequately with the other potential variables. For the sake of convenience, only spatial coordinate variables are considered in this study. Elevation and weather radar data could help to characterize the precipitation field and supplement spatial variation.

    A Bayesian approach is introduced to assess prediction uncertainty related to the uncertainty of parameter estimation in variogram analysis. There are some obstacles when conducting Bayesian prediction practically, to the first of which is eliciting prior information. When there is little a priori information about the parameters, a non-informative or diffuse prior distribution is often chosen regularly. In this case, the subsequent result is dominated by likelihood or data information when the sample size is sufficiently large. Furthermore, although the sample size is large in spatial prediction, the effective sample size is much smaller than the original sample size due to the high correlation between the spatial data. It is clear that prior information enhances not only parameter estimation but also the spatial prediction. The second concern is that the Bayesian approach usually requires intensive computation for evaluating the integral related to finding posterior and predictive distribution. If some convenient prior data are used, a conjugate prior distribution, the evaluation becomes simple because they can be evaluated analytically. Otherwise, more feasible evaluation strategies are necessary, such as Markov chain Monte Carlo (MCMC) methods. For large sample sizes, the problem becomes more serious due to high dimensionality leading to inversion of the high dimension matrix.

    Temporal processes are often modeled to account for the temporal variation of the precipitation process. It is obvious that temporal processes can enhance the modeling and prediction of the underlying process if the level of temporal variation is significant and the corresponding temporal process can take account of the variation appropriately. There are a variety of spatial-temporal processes available for this purpose (Guttorp et al., 1994; Cressie and Huang, 1999; Genton, 2007; Ma, 2008). However, it is necessary to be aware that these do not always improve the spatial prediction because complicated models can worsen it. The complexity of the processes also increases the computation time, which can be an important concern in practical operations.

    In QPE, it is usual to observe outliers in the precipitation dataset. Classical variogram estimation is sensitive to outliers, which can propagate in spatial predictions. TOK- and BTOK-transformed datasets have no outliers. As such, transformation is helpful to not only comply with the required assumptions, but also to deal with outliers. Transformation can also remedy heteroscedasticity, required in kriging for fulfilling second-order stationarity (Erdin and Frei, 2012).

Reference

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return