CLDASSD: Reconstructing Fine Textures of the Temperature Field Using Super-Resolution Technology


doi: 10.1007/s00376-021-0438-y

  • Before 2008, the number of surface observation stations in China was small. Thus, the surface observation data were too sparse to effectively support the High-resolution China Meteorological Administration's Land Assimilation System (HRCLDAS), which ultimately inhibited the output of high-resolution and high-quality gridded products. This paper proposes a statistical downscaling model based on a super-resolution deep learning algorithm to address this problem. Specifically, we take temperature as an example. The model, named CLDASSD, downscales the 0.0625° × 0.0625°, 2-m temperature data from the China Meteorological Administration's Land Data Assimilation System (CLDAS) to 0.01° × 0.01°. We performed quality control on the paired data from CLDAS and HRCLDAS, using data from 2018 and 2019. CLDASSD was trained on the data from 31 March 2018 to 28 February 2019, and then tested with the remaining data. Finally, extensive experiments were conducted in the Beijing-Tianjin-Hebei region, which features complex and diverse geomorphology. With the HRCLDAS product and surface observation data taken as the “true values”, CLDASSD reduces the root mean square error (RMSE) by approximately 0.1°C relative to bilinear interpolation, especially in complex terrain such as mountains, and raises the structural similarity (SSIM) by approximately 0.2. CLDASSD can estimate detailed textures, in terms of spatial distribution, with greater accuracy than bilinear interpolation and other sub-models and can perform the expected downscaling tasks.
  • Figure 1.  A map of the Beijing-Tianjin-Hebei region assembled from high-definition satellite maps provided by Google Earth. The red box delineates our test area, in which the plateau in the northwest, the mountains in the middle, the plains in the south, and the Bohai Bay in the east are visible.

    Figure 2.  Graphical displays of the paired data from CLDAS (the spatial resolution is 0.0625° × 0.0625°) and HRCLDAS (0.01° × 0.01°) on 12 May 2018, as an example to show the abnormal samples that need to be eliminated in our experiments. The leftmost column shows the histogram comparison between the two resolution products. The middle column is the histogram of the residuals between them. The rightmost column is their scatter plot. (a), (b), (c), and (d) represent 0000, 0600, 1200, and 1800 UTC, respectively.

    Figure 3.  A violin plot of the entire data set before and after quality control. After our effective quality control, most of the outliers have been eliminated.

    Figure 4.  The design framework of CLDASSD. (a) shows the overall structure of the model; (b) shows the specific design details of the model's attention unit.

    Figure 5.  DEM map of the Beijing-Tianjin-Hebei region. We divided it into an 8 × 8 chessboard and assigned an ID to each patch.

    Figure 6.  These four figures show the evaluation results of different metrics of different models in daily times (All times are coordinated universal time, UTC). (a), (b), (c), and (d) represent RMSE, MAE, PSNR and SSIM, respectively. CLDASSD performs best at 0600 UTC, i.e., noon in the local area.

    Figure 7.  Analysis from 15 April 2019 used as an example to show the daily change of the spatial distribution of the temperature field in a fixed mountainous area (the area in the black box). (a), (b), (c), and (d) denote 0000, 0600, 1200, and 1800 UTC, respectively.

    Figure 8.  These four figures show the evaluation results of different metrics of different models in all seasons. (a), (b), (c), and (d) represent RMSE, MAE, PSNR, and SSIM, respectively. CLDASSD performs best in summer.

    Figure 9.  Analysis from the first day in each representative month of four seasons at 0600 UTC as an example to compare the seasonal change in reconstruction fields of bilinear interpolation and CLDASSD. The leftmost column is the product of HRCLDAS, the middle column is the output of bilinear interpolation, and the rightmost column is the output of CLDASSD. (a), (b), (c), and (d) denote the first day in January, April, July, October, respectively.

    Figure 10.  The line chart of bias compares the bias of bilinear interpolation and CLDASSD on the test set. Because a different amount of data is discarded by quality control each month, the axis scale is not uniform.

    Figure 11.  The frequency of COR and RMSE for bilinear interpolation and CLDASSD. It is clear that the output quality of CLDASSD is better than that of bilinear interpolation.

    Table 1.  Design framework of several ablation experiments (sub-models). It is used to evaluate the contribution of global skip connection and attention mechanism.

    Model        Structure
    CLDASSD_w    The entire network has only a simple stack of residual structures.
    CLDASSD_a    Based on the residual structure, we added the attention unit we designed.
    CLDASSD_g    Based on CLDASSD_w, a global skip connection is added.
    CLDASSD      This model has both a global skip connection and an attention mechanism.

    Table 2.  The average statistics for each metric in the test set (bold stands for the best). It can be seen that there is an improvement in SSIM.

    Metrics    Bilinear    CLDASSD_w    CLDASSD_a    CLDASSD_g    CLDASSD
    RMSE       1.37        1.34         1.34         1.31         1.30
    MAE        0.97        0.95         0.95         0.94         0.93
    PSNR       29.90       30.68        30.81        31.06        31.21
    SSIM       0.35        0.57         0.59         0.59         0.60

    Table 3.  Classification of all the patches in Fig. 5. We divided them into four types, i.e., mountain, plain, water body, and plateau.

    Terrain type    Set
    Mountain        {5, 12, 13, 18, 19, 20, 21, 22, 25, 26, 27, 32, 33, 34, 40, 41, 48, 49, 56, 57, 60, 61}
    Plain           {6, 7, 14, 15, 23, 28, 29, 30, 35, 36, 42, 43, 44, 50, 51, 52, 53, 55, 58, 59}
    Water body      {31, 37, 38, 39, 45, 46, 47, 54, 62, 63}
    Plateau         {0, 1, 2, 3, 4, 8, 9, 10, 11, 16, 17, 24}

    Table 4.  The reconstruction field for each time was divided into four different terrain types mentioned in Table 3 for evaluation. Although the RMSE is the highest in the mountainous area, it has the biggest improvement compared with that in bilinear interpolation (bold stands for the best).

    Metrics    Type          Bilinear    CLDASSD_w    CLDASSD_a    CLDASSD_g    CLDASSD
    RMSE       Plain         1.01        1.01         1.01         1.0          0.99
               Water body    1.19        1.18         1.19         1.19         1.17
               Mountain      1.61        1.55         1.55         1.51         1.50
               Plateau       1.35        1.33         1.33         1.33         1.32
    MAE        Plain         0.73        0.73         0.73         0.73         0.72
               Water body    0.87        0.86         0.86         0.87         0.85
               Mountain      1.16        1.13         1.13         1.1          1.11
               Plateau       1.0         0.99         0.99         0.99         0.98
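
The terrain-wise evaluation summarized in Tables 3 and 4 can be sketched as follows. This is a minimal illustration, assuming that squared errors are pooled over all patches of a terrain type before the root is taken; the names `TERRAIN_SETS`, `TERRAIN_OF`, and `rmse_by_terrain` and the `(64, H, W)` array layout are hypothetical, and only the patch IDs come from Table 3.

```python
import numpy as np

# Patch IDs per terrain type, copied from Table 3.
TERRAIN_SETS = {
    "Mountain": {5, 12, 13, 18, 19, 20, 21, 22, 25, 26, 27, 32, 33, 34,
                 40, 41, 48, 49, 56, 57, 60, 61},
    "Plain": {6, 7, 14, 15, 23, 28, 29, 30, 35, 36, 42, 43, 44, 50, 51,
              52, 53, 55, 58, 59},
    "Water body": {31, 37, 38, 39, 45, 46, 47, 54, 62, 63},
    "Plateau": {0, 1, 2, 3, 4, 8, 9, 10, 11, 16, 17, 24},
}

# Reverse lookup: patch ID -> terrain type.
TERRAIN_OF = {pid: name for name, ids in TERRAIN_SETS.items() for pid in ids}


def rmse_by_terrain(pred_patches, true_patches):
    """Pool squared errors over all patches of a terrain type, then take the root.

    pred_patches, true_patches: arrays of shape (64, H, W), indexed by patch ID
    (an assumed layout, not taken from the paper).
    """
    result = {}
    for name, ids in TERRAIN_SETS.items():
        idx = sorted(ids)
        err = pred_patches[idx] - true_patches[idx]
        result[name] = float(np.sqrt(np.mean(err ** 2)))
    return result
```

The four sets partition the 64 patch IDs, so every patch contributes to exactly one terrain score.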
  • Ahn, N., B. Kang, and K.-A. Sohn, 2018: Fast, accurate, and lightweight super-resolution with cascading residual network. Proc. 15th European Conf. on Computer Vision, Munich, Germany, Springer, 252−268, https://doi.org/10.1007/978-3-030-01249-6_16.
    Cheng, W. C., X. K. Shi, W. J. Zhang, Z. G. Wang, and P. Xing, 2020: An NWP precipitation products downscaling method based on deep learning. Journal of Tropical Meteorology, 36, 307−316, https://doi.org/10.16032/j.issn.1004-4965.2020.029. (in Chinese with English abstract)
    Dong, C., C. C. Loy, K. M. He, and X. O. Tang, 2014: Learning a deep convolutional network for image super-resolution. Proc. 13th European Conf. on Computer Vision, Zurich, Switzerland, Springer, 184−199, https://doi.org/10.1007/978-3-319-10593-2_13.
    Dong, C., C. C. Loy, K. M. He, and X. O. Tang, 2016: Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, 295−307, https://doi.org/10.1109/TPAMI.2015.2439281.
    Drozdzal, M., E. Vorontsov, G. Chartrand, S. Kadoury, and C. Pal, 2016: The importance of skip connections in biomedical image segmentation. Proc. 1st International Workshop Deep Learning and Data Labeling for Medical Applications, Athens, Greece, Springer, 179−187, https://doi.org/10.1007/978-3-319-46976-8_19.
    Han, S., C. X. Shi, B. Xu, S. Sun, T. Zhang, L. P. Jiang, and X. Liang, 2019: Development and evaluation of hourly and kilometer resolution retrospective and real-time surface meteorological blended forcing dataset (SMBFD) in China. Journal of Meteorological Research, 33, 1168−1181, https://doi.org/10.1007/s13351-019-9042-9.
    Han, S., B. C. Liu, C. X. Shi, Y. Liu, M. J. Qiu, and S. Sun, 2020: Evaluation of CLDAS and GLDAS datasets for near-surface air temperature over major land areas of China. Sustainability, 12, 4311, https://doi.org/10.3390/su12104311.
    Hu, Y. T., J. Li, Y. F. Huang, and X. B. Gao, 2020: Channel-wise and spatial feature modulation network for single image super-resolution. IEEE Transactions on Circuits and Systems for Video Technology, 30, 3911−3927, https://doi.org/10.1109/tcsvt.2019.2915238.
    Keys, R., 1981: Cubic convolution interpolation for digital image processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29, 1153−1160, https://doi.org/10.1109/TASSP.1981.1163711.
    Kim, J., J. Lee, and K. M. Lee, 2016: Accurate image super-resolution using very deep convolutional networks. Proc. 2016 IEEE Conf. on Computer Vision and Pattern Recognition, Las Vegas, USA, IEEE, 1646−1654, https://doi.org/10.1109/CVPR.2016.182.
    Lai, W.-S., J.-B. Huang, N. Ahuja, and M.-H. Yang, 2017: Deep Laplacian pyramid networks for fast and accurate super-resolution. Proc. 2017 IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, USA, IEEE, 5835−5843, https://doi.org/10.1109/cvpr.2017.618.
    Mao, Z. R., 2019: Climate data downscaling through single image super-resolution. M.S. thesis, Wuhan University. (in Chinese with English abstract)
    Odena, A., V. Dumoulin, and C. Olah, 2016: Deconvolution and checkerboard artifacts. Distill, 1, e3, https://doi.org/10.23915/distill.00003.
    Reichstein, M., G. Camps-Valls, B. Stevens, M. Jung, J. Denzler, N. Carvalhais, and Prabhat, 2019: Deep learning and process understanding for data-driven Earth system science. Nature, 566, 195−204, https://doi.org/10.1038/s41586-019-0912-1.
    Shi, C. X., and Coauthors, 2019: A review of multi-source meteorological data fusion products. Acta Meteorologica Sinica, 77, 774−783, https://doi.org/10.11676/qxxb2019.043. (in Chinese with English abstract)
    Singh, A., A. Albert, and B. L. White, 2019: Downscaling numerical weather models with GANs. Proc. American Geophysical Union, Fall Meeting 2019. [Available online from https://ams.confex.com/ams/2020Annual/webprogram/Manuscript/Paper365409/CI_2019_Alok.pdf]
    Tai, Y., J. Yang, and X. M. Liu, 2017a: Image super-resolution via deep recursive residual network. Proc. 2017 IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, USA, IEEE, 2790−2798, https://doi.org/10.1109/cvpr.2017.298.
    Tai, Y., J. Yang, X. M. Liu, and C. Y. Xu, 2017b: MemNet: A persistent memory network for image restoration. Proc. 2017 IEEE International Conf. on Computer Vision, Venice, Italy, IEEE, 4549−4557, https://doi.org/10.1109/iccv.2017.486.
    Vandal, T., E. Kodra, S. Ganguly, A. Michaelis, R. Nemani, and A. R. Ganguly, 2017: DeepSD: Generating high resolution climate change projections through single image super-resolution. Proc. 23rd ACM SIGKDD International Conf. on Knowledge Discovery and Data Mining, Halifax, NS, Canada, ACM, 1663−1672, https://doi.org/10.1145/3097983.3098004.
    Wang, X. T., K. Yu, S. X. Wu, J. J. Gu, Y. H. Liu, C. Dong, Y. Qiao, and C. C. Loy, 2019: ESRGAN: Enhanced super-resolution generative adversarial networks. Proc. European Conf. on Computer Vision, Cham, Springer, 63−79, https://doi.org/10.1007/978-3-030-11021-5_5.
    Wang, Z., A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, 2004: Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13, 600−612, https://doi.org/10.1109/TIP.2003.819861.
    Wang, Z. H., J. Chen, and S. C. H. Hoi, 2020: Deep learning for image super-resolution: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, https://doi.org/10.1109/TPAMI.2020.2982166.

Manuscript History

Manuscript received: 10 March 2021
Manuscript revised: 08 July 2021
Manuscript accepted: 11 August 2021

    Corresponding author: Chunxiang SHI, shicx@cma.gov.cn
  • 1. Institute of Aerospace Information, Space Engineering University, Beijing 101400, China
  • 2. National Meteorological Information Center, China Meteorological Administration, Beijing 100081, China

    • Surface observation data are a critical component of the High-Resolution China Meteorological Administration's Land Assimilation System (HRCLDAS). At present, there are more than 2400 national meteorological stations and approximately 60 000 regional automatic stations. To obtain high-resolution and high-quality assimilation products, it is necessary to assimilate large amounts of surface observation data. However, before 2008, meteorological stations in China were too sparse to provide surface observation data with high coverage. Taking the national reference stations and national basic stations as an example, there were fewer than 100 stations in 1930 and between 100 and 200 from the 1930s to the mid-1940s; the number dropped below 100 between the mid-1940s and the early 1950s before finally increasing to approximately 2400 in 2020. Since 2008, the China Meteorological Administration has set up more than 60 000 densely distributed automatic stations. Consequently, the observational data available before 2008 are insufficient for HRCLDAS to back-calculate high-resolution assimilation data for that period. We believe that deep-learning-based statistical downscaling offers one potential way to accurately reconstruct high-resolution data, which could be used to fill the product gaps in HRCLDAS, especially before 2008. Therefore, this research established a super-resolution deep learning model that attempts to generate high-resolution data. This work is applicable to high-resolution technology research in general as well as to potential future work that could back-calculate assimilation data and help fill the data gaps caused by sparse meteorological stations prior to 2008.

      Our model draws on super-resolution technology because Statistical Downscaling (SD) in meteorology is similar to Super-Resolution (SR) in computer vision. The former establishes a nonlinear mapping between element fields at different scales (a mapping affected by various atmospheric physical factors); the latter recovers images degraded (for example, by blur and noise) during channel propagation. Both aim to learn a mapping from coarse (low-resolution) images or fields to fine (high-resolution) ones. In recent years, deep-learning-based super-resolution technology has been booming, opening up new ideas for many statistical downscaling studies. For example, Vandal et al. (2017) stacked multiple super-resolution convolutional neural networks (SRCNN, Dong et al., 2014) and proposed the first deep-learning-based downscaling model, deep statistical downscaling (DeepSD), to perform precipitation downscaling tasks with different scaling factors. Mao (2019) modified DeepSD by adding a residual structure and deepening the network, and proposed the very deep statistical downscaling model (VDSD). Cheng et al. (2020) proposed the numerical weather prediction multi-time super-resolution model (NWP-MTSR) to downscale the precipitation field. Singh et al. (2019) verified that the enhanced super-resolution generative adversarial network (ESRGAN, Wang et al., 2019) can be transferred to a wind field downscaling task.

      The resolution of most previous field reconstructions was relatively coarse (approximately 0.05° × 0.05°), and the spatial geomorphic feature information was not detailed enough. Compared with the previous similar statistical downscaling works based on deep learning, our study aims to back-calculate the 0.01° × 0.01° resolution product from HRCLDAS. Therefore, we propose the China Meteorological Administration's Land Assimilation System Statistical Downscaling Model (CLDASSD) to downscale the 2-m temperature product (0.0625° × 0.0625°) from CLDAS and reconstruct higher-resolution (0.01° × 0.01°) temperature products.

      Before the experiment, we performed a statistical analysis of the paired data from CLDAS and HRCLDAS. Then, we explored the quality control methods of the data. In the model training and testing stage, through comparative experiments and ablation experiments, we took the data from HRCLDAS and surface observation data as “true values” to evaluate the performance of CLDASSD. Our experimental area is the Beijing-Tianjin-Hebei region which has complex and diverse geomorphology and is located at the heart of the Bohai Rim in Northeast Asia and China (35.5°–43.5°N, 112.5°–120.5°E). Figure 1 shows the geographical location and topography of the area, which contains various landforms, such as water bodies, mountains, and plains.

      The remainder of this paper is organized as follows: Section 2 describes the datasets used in this study, section 3 describes the methodology of the downscaling technique, section 4 provides the results and associated discussion, and section 5 summarizes the research.

    2.   Datasets
    • The inputs of CLDASSD are a low-resolution temperature field and a high-resolution digital elevation model (DEM). The output is the corresponding high-resolution temperature field.

      The low-resolution, 2-m temperature data come from CLDAS-Version 2.0. The data are a grid fusion product covering the Asian region (0°–65°N, 60°–160°E) on an equal latitude-longitude grid, with a spatial resolution of 0.0625° and a temporal resolution of one hour. This dataset draws on multiple ground and satellite observation data sources and on techniques such as multigrid variational assimilation (STMAS), cumulative probability density function matching (CDF), physical inversion, and terrain correction. Over China, its quality is better than that of similar international products (Han et al., 2020).

      The CLDASSD target temperature data come from HRCLDAS-Version 1.0, newly developed by the National Meteorological Information Center. The hourly grid fusion product set with a resolution of 0.01° includes surface pressure information, approximate 2-m temperature, 2-m specific humidity, precipitation, 10-m wind speed, and solar shortwave radiation (Han et al., 2019). The grid products of temperature, pressure, humidity, and wind speed are realized mainly by fusing European Centre for Medium-Range Weather Forecasts (ECMWF) analysis and forecast field data with observation data from more than 40 000 stations across the country through the STMAS method (Shi et al., 2019).

      The 0.01° DEM data over China are derived from 90-m-resolution terrain data from the National Aeronautics and Space Administration's shuttle radar topographic mission. At this resolution, most mountains, rivers, and plains can be well represented.

      In the selected experimental area, the 0.0625° × 0.0625° temperature field is a 128 × 128 array, and the 0.01° × 0.01° temperature field and 0.01° × 0.01° DEM data are both 800 × 800 arrays. To limit the total amount of data while still accounting for intra-day differences, we selected the paired data at 0000, 0600, 1200, and 1800 UTC from CLDAS and HRCLDAS to construct the experimental database.
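
The array shapes quoted above follow directly from the extent of the experimental area (35.5°–43.5°N, 112.5°–120.5°E) and the two grid spacings. The short check below illustrates the arithmetic; it assumes that the number of cells along each axis equals the extent divided by the resolution, and the names `LAT_RANGE`, `LON_RANGE`, and `grid_shape` are introduced here for illustration only.

```python
# Experimental area from the text: 35.5°–43.5°N, 112.5°–120.5°E.
LAT_RANGE = (35.5, 43.5)
LON_RANGE = (112.5, 120.5)


def grid_shape(resolution_deg):
    """Number of grid cells (rows, cols) covering the area at a given resolution."""
    n_lat = round((LAT_RANGE[1] - LAT_RANGE[0]) / resolution_deg)
    n_lon = round((LON_RANGE[1] - LON_RANGE[0]) / resolution_deg)
    return n_lat, n_lon


coarse_shape = grid_shape(0.0625)   # CLDAS grid
fine_shape = grid_shape(0.01)       # HRCLDAS and DEM grid
scale_factor = fine_shape[0] / coarse_shape[0]

print(coarse_shape, fine_shape, scale_factor)  # (128, 128) (800, 800) 6.25
```

The non-integer scale factor of 6.25 is why the upsampling step cannot rely on a fixed integer stride.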

      To make reasonable use of GPU resources, we used a sliding window to crop the samples into many small patches. In each pair, the low-resolution patch is 16 × 16, and the high-resolution patch and DEM patch are both 100 × 100. We use this size because a patch that is too large occupies excessive GPU memory, whereas one that is too small cannot capture the associated spatial information (Wang et al., 2020). We arrived at this patch size through many tests.
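
The paired cropping can be sketched as follows. The stride of the sliding window is not stated in the text, so this sketch assumes a non-overlapping window (stride equal to the patch size), which yields an 8 × 8 layout of patches; the function name `crop_paired_patches` and the row-major ordering are likewise assumptions.

```python
import numpy as np


def crop_paired_patches(lr, hr, dem, lr_size=16, hr_size=100):
    """Crop aligned (low-res, high-res, DEM) patches with a non-overlapping window.

    lr: (128, 128) coarse temperature field.
    hr, dem: (800, 800) fine temperature field and elevation.
    Returns a list of (lr_patch, hr_patch, dem_patch) tuples in row-major order.
    """
    n_rows = lr.shape[0] // lr_size
    n_cols = lr.shape[1] // lr_size
    patches = []
    for r in range(n_rows):
        for c in range(n_cols):
            lr_patch = lr[r * lr_size:(r + 1) * lr_size,
                          c * lr_size:(c + 1) * lr_size]
            # The same geographic window on the fine grid (16 * 6.25 = 100 cells).
            rows = slice(r * hr_size, (r + 1) * hr_size)
            cols = slice(c * hr_size, (c + 1) * hr_size)
            patches.append((lr_patch, hr[rows, cols], dem[rows, cols]))
    return patches
```

Because 16 × 6.25 = 100, each low-resolution patch covers exactly the same area as its high-resolution and DEM counterparts.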

      The entire database spans January 2018 to December 2019. The data from March 2018 to February 2019 were used as the training set (we extracted 10% according to the time and season as the validation set). The remaining data were used as a test set to evaluate the performance of the model.

    • The 0.0625° and 0.01° products resolve the same synoptic scale, so there should be no systematic error between them. However, a Gaussian random error exists between them, as follows from the central limit theorem: many physical factors contribute, each with only a minor effect. Nevertheless, our experimental data come from different assimilation systems (low-resolution data from CLDAS, high-resolution data from HRCLDAS), and a small amount of paired data may have a systematic error.

      Taking the temperature product on 10 March 2018, as an example, Fig. 2 shows that at 0000, 0600, and 1200 UTC, the distributions between the high- and low-resolution fields are highly consistent, and the correlation coefficients (COR) between them are close to 1. Furthermore, the root mean square error (RMSE) is approximately 1°C, and the residual error obeys a normal distribution with a mean close to 0. In contrast, at 1800 UTC, the distribution differs considerably between the high- and low-resolution fields. The residuals show a skewed distribution, and the COR is lower than in the other samples. The RMSE is close to 3°C. This type of sample is considered to be poor-quality data and is to be eliminated before conducting experiments. The generation of such dirty data depends on the stability of HRCLDAS and the quality of the background field. Their data distribution is different from most samples. If we do not delete them, this will prevent the model from focusing on learning from those high-quality data distributions, thereby affecting the final performance of the model.
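
The diagnostics used in this comparison (COR, RMSE, and the shape of the residual distribution) can be computed as in the sketch below. It assumes that the low-resolution field has already been interpolated onto the high-resolution grid; the function name `paired_statistics` and the use of sample skewness as the measure of a "skewed distribution" are choices made here for illustration.

```python
import numpy as np


def paired_statistics(lr_on_fine_grid, hr):
    """Summary statistics for one paired sample on a common grid (values in °C)."""
    residual = (hr - lr_on_fine_grid).ravel()
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    cor = float(np.corrcoef(lr_on_fine_grid.ravel(), hr.ravel())[0, 1])
    mean = float(residual.mean())
    std = float(residual.std())
    # Sample skewness: close to 0 for a symmetric (e.g., normal) residual.
    skewness = float(np.mean(((residual - mean) / std) ** 3)) if std > 0 else 0.0
    return {"rmse": rmse, "cor": cor,
            "residual_mean": mean, "residual_skewness": skewness}
```

A healthy sample in the sense described above would show a COR near 1, an RMSE near 1°C, and a residual mean and skewness near 0.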

      Based on the above considerations, this article adopts the following quality control steps:

      (1) Climatic range check: According to the national standard, the limit value should be between –80°C and 60°C. Data outside this range are eliminated.

      (2) Data that lie outside the ±3σ confidence interval of the residual distribution between the high- and low-resolution data are eliminated.

      (3) Finally, all test samples are verified manually.
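
Steps (1) and (2) can be automated along the following lines. This is only a sketch under stated assumptions: the low-resolution fields are taken to be already interpolated onto the high-resolution grid, and the ±3σ rule is applied to the per-sample mean residual across the dataset, which is one possible reading of step (2). The function name `quality_control` and its parameters are hypothetical.

```python
import numpy as np


def quality_control(lr_on_fine_grid, hr, t_min=-80.0, t_max=60.0, n_sigma=3.0):
    """Return a boolean mask of samples that pass steps (1) and (2).

    lr_on_fine_grid, hr: arrays of shape (n_samples, H, W) in °C.
    """
    # Step (1): climatic range check on every grid value of both products.
    in_range = np.all(
        (lr_on_fine_grid >= t_min) & (lr_on_fine_grid <= t_max)
        & (hr >= t_min) & (hr <= t_max),
        axis=(1, 2),
    )
    # Step (2): the per-sample mean residual must lie within n_sigma standard
    # deviations of the mean over all samples that passed step (1).
    mean_residual = (hr - lr_on_fine_grid).mean(axis=(1, 2))
    center = mean_residual[in_range].mean()
    spread = mean_residual[in_range].std()
    within_sigma = np.abs(mean_residual - center) <= n_sigma * spread
    return in_range & within_sigma
```

Samples flagged `False` would be dropped before training; step (3), the manual check, has no automated counterpart here.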

      The effect of quality control on the entire database is shown in Fig. 3. Before quality control, a large share of the samples had an RMSE greater than 5°C, and the COR was concentrated mainly above 0.9. After quality control, the outliers are eliminated, and the RMSE and COR lie within a reasonable range.

    3.   Methods
    • The backbone framework of our model uses VDSD (Mao, 2019). The very deep statistical downscaling model (VDSD) contains a large number of local skip connections. We have improved it, adding a global skip connection and attention mechanism, and call it the CLDAS Statistical Downscaling Model (CLDASSD). The input of CLDASSD is a coarse-resolution temperature field (0.0625° × 0.0625°) and fine-resolution DEM data (0.01° × 0.01°). After model training, the output is the temperature field magnified 6.25-fold (0.01° × 0.01°). CLDASSD mainly contains convolutional layers, pooling layers, rectified linear unit layers (ReLU), and sigmoid layers. The overall model design is shown in Fig. 4.

      Figure 4.  The design framework of CLDASSD. (a) shows the overall structure of the model; (b) shows the specific design details of the model's attention unit.

    • CLDASSD follows the pre-upsampling structure of SRCNN (Dong et al., 2016), which places the upsampling layer at the head of the model. The upsampling uses bicubic interpolation (Keys, 1981), which has no parameters that must be set manually for different scale factors and therefore improves the reusability of the model. Furthermore, this approach avoids the adverse effects of some learnable upsampling methods, such as the checkerboard artifacts produced by transposed convolution (Odena et al., 2016).
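
      To make the pre-upsampling step concrete, the sketch below resamples a field with the cubic convolution kernel of Keys (1981). It is an illustration under stated assumptions, not the implementation used in CLDASSD: the kernel parameter a = -0.5, the centre-aligned coordinate mapping, the edge clamping, and all function names are choices made here.

```python
import math


def keys_kernel(x, a=-0.5):
    """Cubic convolution kernel of Keys (1981); a = -0.5 is assumed."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x ** 3 - (a + 3.0) * x ** 2 + 1.0
    if x < 2.0:
        return a * x ** 3 - 5.0 * a * x ** 2 + 8.0 * a * x - 4.0 * a
    return 0.0


def upsample_1d(row, scale):
    """Resample one row by an arbitrary (possibly non-integer) factor."""
    n_out = int(round(len(row) * scale))
    out = []
    for j in range(n_out):
        # Centre-aligned position of output cell j in input coordinates.
        src = (j + 0.5) / scale - 0.5
        base = math.floor(src)
        value = 0.0
        # Four nearest input cells contribute; indices are clamped at edges.
        for k in range(base - 1, base + 3):
            idx = min(max(k, 0), len(row) - 1)
            value += row[idx] * keys_kernel(src - k)
        out.append(value)
    return out


def upsample_2d(field, scale):
    """Separable bicubic upsampling: rows first, then columns."""
    rows = [upsample_1d(r, scale) for r in field]
    cols = [upsample_1d(list(c), scale) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

      Because the scale factor enters only through the coordinate mapping, the same function serves 0.0625° to 0.01° (a factor of 6.25) and any other ratio without further configuration.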

    • Suppose the model is $ \mathcal{F}\left(*\right) $, the input low-resolution temperature field is denoted by $ {T}_{\mathrm{c}\mathrm{o}\mathrm{a}\mathrm{r}\mathrm{s}\mathrm{e}} $, the high-resolution DEM by $ H $, and the output high-resolution temperature field by $ {T}_{\mathrm{f}\mathrm{i}\mathrm{n}\mathrm{e}} $; then,

      Here, $ x $ is the residual error between the high- and low-resolution fields following the Gaussian distribution. The model fits a sparse Gaussian distribution, which is much easier than directly learning the mapping from low-resolution fields to high-resolution fields (Drozdzal et al., 2016). This connection method is widely used in single-image super-resolution (Kim et al., 2016; Tai et al., 2017a, b; Ahn et al., 2018).

      Therefore, we directly add the low-resolution field after the upsampling layer to the end of the model in CLDASSD (see Fig. 4a), which avoids learning the mapping of the entire temperature field to the entire temperature field and dramatically reduces the difficulty of model learning.
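
      One compact way to write the global skip connection described here is given below. It is a sketch rather than the paper's own notation: $ U $ (the bicubic upsampling operator), $ \tilde{T} $ (the upsampled coarse field), and $ \mathcal{R} $ (the stack of convolutional and attention layers) are symbols introduced only for this illustration.

```latex
\begin{aligned}
\tilde{T} &= U\left(T_{\mathrm{coarse}}\right), \\
x &= \mathcal{R}\left(\tilde{T}, H\right), \\
T_{\mathrm{fine}} &= \mathcal{F}\left(T_{\mathrm{coarse}}, H\right) = \tilde{T} + x .
\end{aligned}
```

      Written this way, the trainable layers only have to produce the residual $ x $, which is close to zero wherever the upsampled coarse field is already accurate.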

    • We use the attention mechanism so that the model can effectively extract critical information and suppress useless information during training. The design of CLDASSD is inspired by the channel-wise and spatial feature modulation network (CSFM; Hu et al., 2020), so we add an attention unit at the end of the standard residual structure.

      The details of the attention unit are shown in Fig. 4b. We call the residual structure containing the attention unit ResAttentionBlock. The attention unit of each ResAttentionBlock contains two branches: a channel attention branch and a spatial attention branch. The spatial attention branch is used for pixel-level feature map processing: a two-dimensional weight vector can be obtained to suppress the low contribution area on each feature map. The channel attention branch is used to perform channel-level feature map processing: a one-dimensional weight vector can be obtained, and the feature map with a low contribution is directly assigned a lower weight. ResAttentionBlock then fuses the feature maps of the two branches after attention weight processing.
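
      The two-branch idea can be illustrated with a deliberately simplified, parameter-free stand-in. Everything specific in this sketch is an assumption: the real attention unit (Fig. 4b) uses learned convolutional, pooling, and sigmoid layers, whereas here the weights come directly from averages, the two branches are fused by an element-wise mean, and the convolutions of the residual structure are omitted. Feature maps are nested lists indexed as [channel][row][column].

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(features):
    """One weight per channel (1-D weight vector) from global average pooling."""
    weights = [sigmoid(sum(map(sum, ch)) / (len(ch) * len(ch[0])))
               for ch in features]
    return [[[w * v for v in row] for row in ch]
            for ch, w in zip(features, weights)]


def spatial_attention(features):
    """One weight per pixel (2-D weight map) from the cross-channel mean."""
    n_ch = len(features)
    h, w = len(features[0]), len(features[0][0])
    weight_map = [[sigmoid(sum(ch[i][j] for ch in features) / n_ch)
                   for j in range(w)] for i in range(h)]
    return [[[weight_map[i][j] * ch[i][j] for j in range(w)]
             for i in range(h)] for ch in features]


def res_attention_block(features):
    """Fuse both branches (mean, an assumption) and add the local skip."""
    ca = channel_attention(features)
    sa = spatial_attention(features)
    return [[[x + 0.5 * (a + s) for x, a, s in zip(rx, ra, rs)]
             for rx, ra, rs in zip(cx, cca, csa)]
            for cx, cca, csa in zip(features, ca, sa)]
```

      The shapes of the weights are the point of the example: the channel branch rescales whole feature maps, while the spatial branch rescales individual locations shared across all channels.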

    • The formula for the vanilla L1 loss and its derivative is

      $ {\mathcal{L}}_{\mathrm{L}1} $ is not differentiable at 0. When the model fits the residual mentioned in section 3.2.1, the residual field contains many zero values, which makes the model's output unstable. Therefore, we use the Charbonnier loss (Lai et al., 2017), a smoothed variant of the vanilla L1 loss, as the model's loss function. The formula is

      where $ {O}_{i} $ denotes the model output at grid point $ i $, $ {G}_{i} $ denotes the HRCLDAS data at grid point $ i $, and $ n $ denotes the number of grid points. $ \epsilon $ is a constant, generally set to $ {10}^{-3} $. This modification makes the loss function differentiable everywhere, and the model training process is more stable than with the vanilla L1 loss.
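
      In its usual form (Lai et al., 2017), the Charbonnier loss is $ \frac{1}{n}\sum_{i=1}^{n}\sqrt{{\left({O}_{i}-{G}_{i}\right)}^{2}+{\epsilon }^{2}} $. The sketch below compares it with the vanilla L1 loss and shows why the gradient stays well defined at zero residual. Averaging over $ n $ rather than summing is an assumption made here, as are the function names.

```python
import math


def l1_loss(outputs, targets):
    """Vanilla L1 loss: mean absolute difference."""
    return sum(abs(o - g) for o, g in zip(outputs, targets)) / len(outputs)


def charbonnier_loss(outputs, targets, eps=1e-3):
    """Charbonnier loss: a smooth approximation of the L1 loss."""
    total = sum(math.sqrt((o - g) ** 2 + eps ** 2)
                for o, g in zip(outputs, targets))
    return total / len(outputs)


def charbonnier_grad(outputs, targets, eps=1e-3):
    """Gradient with respect to each output; finite and continuous everywhere."""
    n = len(outputs)
    return [(o - g) / (n * math.sqrt((o - g) ** 2 + eps ** 2))
            for o, g in zip(outputs, targets)]
```

      At a zero residual the gradient is exactly 0 instead of jumping between -1 and +1, while for residuals much larger than $ \epsilon $ it approaches the sign function, so the loss behaves like L1 away from zero.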

    • CLDASSD is built on TensorFlow 1.4, and the entire model is trained on one NVIDIA 2080 Ti GPU. All convolutional layers in the model use a 3 × 3 kernel with a stride of 1 and Gaussian initialization; padding is used so that all feature maps in the data stream keep their shape. The number of ResAttentionBlocks is written as mx, meaning that there are m blocks (see Fig. 4a). In this paper, m is 9 and the training batch size is 64 (these parameters were tuned repeatedly). The Adam optimizer is used with a learning rate of 0.001 to optimize the network parameters.

    • To evaluate the results of CLDASSD in detail and comprehensively, we design the "double true values" evaluation. Specifically, first, we use the observation data as the "true value" and take bias, root mean square error (RMSE), mean absolute error (MAE), and COR as metrics to evaluate the reconstruction field of the model. Moreover, we also care about the spatial distribution of the model’s reconstruction field and the accuracy of the texture details at high resolution. Therefore, we take HRCLDAS as the “true value” and use the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), which are often used in super-resolution, to evaluate the similarity between the reconstruction field and HRCLDAS.

      Bias, RMSE, MAE, and COR are used mainly to evaluate the reconstruction field's pixel-level error. Their formulas are as follows:

      where $ {O}_{i} $ denotes the observations at weather station $ i $ (i.e., the true value), $ {G}_{i} $ denotes the reconstruction field interpolated to the corresponding station $ i $, and $ N $ denotes the number of stations.
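
      In their standard forms, the four station-based metrics can be computed as sketched below. The argument order and function names are hypothetical, COR is taken to be the Pearson correlation coefficient, and the sign convention for bias (reconstruction minus observation) is an assumption.

```python
import math


def bias(recon, obs):
    """Mean of (reconstruction - observation); sign convention assumed."""
    return sum(g - o for g, o in zip(recon, obs)) / len(obs)


def rmse(recon, obs):
    """Root mean square error."""
    return math.sqrt(sum((g - o) ** 2 for g, o in zip(recon, obs)) / len(obs))


def mae(recon, obs):
    """Mean absolute error."""
    return sum(abs(g - o) for g, o in zip(recon, obs)) / len(obs)


def cor(recon, obs):
    """Pearson correlation coefficient (undefined for constant inputs)."""
    n = len(obs)
    mean_g = sum(recon) / n
    mean_o = sum(obs) / n
    cov = sum((g - mean_g) * (o - mean_o) for g, o in zip(recon, obs))
    var_g = sum((g - mean_g) ** 2 for g in recon)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_g * var_o)
```

      A reconstruction that is uniformly 0.5°C too warm gives a bias, RMSE, and MAE of 0.5°C but a COR of 1, which is why the metrics are reported together.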

      For spatial distribution evaluation of the model downscaling results, we use PSNR and SSIM (Wang et al., 2004). The formulas are as follows:

      $ {I}_{\mathrm{m}\mathrm{a}\mathrm{x}} $ refers to the maximum possible value of the data, which is 255 for natural images of the uint8 data type. Therefore, when calculating the PSNR of the temperature field, we rescale the data to the range [0, 255] so that $ {I}_{\mathrm{m}\mathrm{a}\mathrm{x}} $ can be taken as 255.

      In Eq. (10), $ {\mu }_{G} $ is the mean value of the reconstruction field, $ {\mu }_{O} $ is the mean value of the real high-resolution field (i.e., the second “true value,” the data from HRCLDAS), $ {\mathrm{\sigma }}_{\mathrm{G}} $ is the standard deviation of the reconstruction field, $ {\sigma }_{O} $ is the standard deviation of the real high-resolution field, and $ {\sigma }_{GO} $ is the covariance between the reconstruction field and the real high-resolution field.
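
      A sketch of both metrics follows, using the standard definitions from Wang et al. (2004). Several details are assumptions: the rescaling limits `t_min` and `t_max` are hypothetical parameters, SSIM is evaluated once over the whole field rather than in sliding windows, and the stabilizing constants take the conventional values $ {C}_{1}={\left(0.01{I}_{\mathrm{max}}\right)}^{2} $ and $ {C}_{2}={\left(0.03{I}_{\mathrm{max}}\right)}^{2} $.

```python
import math


def to_uint8_range(field, t_min, t_max):
    """Linearly rescale temperatures from [t_min, t_max] to [0, 255]."""
    scale = 255.0 / (t_max - t_min)
    return [(t - t_min) * scale for t in field]


def psnr(recon, truth, i_max=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical fields."""
    mse = sum((g - o) ** 2 for g, o in zip(recon, truth)) / len(truth)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(i_max ** 2 / mse)


def ssim_global(recon, truth, i_max=255.0):
    """Single-window SSIM over the whole field (simplifying assumption)."""
    n = len(truth)
    mu_g = sum(recon) / n
    mu_o = sum(truth) / n
    var_g = sum((g - mu_g) ** 2 for g in recon) / n
    var_o = sum((o - mu_o) ** 2 for o in truth) / n
    cov = sum((g - mu_g) * (o - mu_o) for g, o in zip(recon, truth)) / n
    c1 = (0.01 * i_max) ** 2
    c2 = (0.03 * i_max) ** 2
    numerator = (2.0 * mu_g * mu_o + c1) * (2.0 * cov + c2)
    denominator = (mu_g ** 2 + mu_o ** 2 + c1) * (var_g + var_o + c2)
    return numerator / denominator
```

      Both fields should be rescaled with the same `t_min` and `t_max` before either metric is computed; otherwise the comparison is distorted by the rescaling itself.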

      To compare with the results of CLDASSD, we use bilinear interpolation as the baseline for comparison experiments. The formula for bilinear interpolation is as follows:

      $ Z\left({I}_{1},{J}_{1}\right) $, $ Z({I}_{1},{J}_{2}) $, $ Z\left({I}_{2},{J}_{1}\right) $, and $ Z({I}_{2},{J}_{2}) $ are the variable values on the grid; $ Z\left({I}_{1},J\right)\;\mathrm{a}\mathrm{n}\mathrm{d}\;Z\left({I}_{2},J\right) $ are the interpolation results obtained after linear interpolation on the latitudes $ {I}_{1} $ and $ {I}_{2} $, respectively; and $ Z\left(I,J\right) $ is the value at a specific position after interpolation.
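
      The two-stage procedure described above (two linear interpolations along $ J $, followed by one along $ I $) can be sketched as follows. The function name and argument order are hypothetical.

```python
def bilinear(z11, z12, z21, z22, i1, i2, j1, j2, i, j):
    """Bilinear interpolation inside one grid cell.

    z11 = Z(I1, J1), z12 = Z(I1, J2), z21 = Z(I2, J1), z22 = Z(I2, J2).
    """
    # Stage 1: linear interpolation along J on the two latitudes I1 and I2.
    wj = (j - j1) / (j2 - j1)
    z1j = z11 + wj * (z12 - z11)  # Z(I1, J)
    z2j = z21 + wj * (z22 - z21)  # Z(I2, J)
    # Stage 2: linear interpolation along I between the two results.
    wi = (i - i1) / (i2 - i1)
    return z1j + wi * (z2j - z1j)  # Z(I, J)
```

      For example, `bilinear(0.0, 2.0, 4.0, 6.0, 0.0, 1.0, 0.0, 1.0, 0.5, 0.5)` returns 3.0, the mean of the four corner values. Because the result is always a weighted average of the corners, it cannot create extremes inside a coarse cell, which is the smoothing behaviour that limits the baseline in complex terrain.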

      To illustrate the contribution of the attention mechanism and global skip connection proposed in this paper, we conduct a series of ablation experiments (sub-models) as shown in Table 1.

      Model        Structure
      CLDASSD_w    The entire network has only a simple stack of residual structures.
      CLDASSD_a    Based on the residual structure, we added the attention unit we designed.
      CLDASSD_g    Based on CLDASSD_w, a global skip connection is added.
      CLDASSD      This model has both a global skip connection and an attention mechanism.

      Table 1.  Design framework of several ablation experiments (sub-models). It is used to evaluate the contribution of global skip connection and attention mechanism.

    4.   Results and discussion
    • In the following results and discussion, all evaluation metrics are computed on the test set unless otherwise stated. For PSNR and SSIM, we take the products from HRCLDAS as the “true value”; the other metrics take station observation data as the “true value.”

      First, the average of each evaluation metric on the test set is given in Table 2. Compared with bilinear interpolation, our model improves every metric, most notably SSIM, which increases by approximately 0.2. This clear improvement in the perceptual metric SSIM gives a preliminary indication of the potential of CLDASSD for estimating the structural features of high-resolution temperature fields.

      Metrics      Bilinear   CLDASSD_w   CLDASSD_a   CLDASSD_g   CLDASSD
      RMSE (°C)    1.37       1.34        1.34        1.31        1.30
      MAE (°C)     0.97       0.95        0.95        0.94        0.93
      PSNR (dB)    29.90      30.68       30.81       31.06       31.21
      SSIM         0.35       0.57        0.59        0.59        0.60

      Table 2.  The average statistics for each metric in the test set (bold stands for the best). It can be seen that there is an improvement in SSIM.

      To explore the model’s ability to reconstruct the temperature field under different types of terrain, we divide the DEM maps of the research area into an 8 × 8 chessboard, i.e., 64 small, uniform patches; the ID of each small patch is shown in Fig. 5. Then, we classify each patch into one of four terrain types (plain, water body, mountain, and plateau). Table 3 shows the specific content of each terrain set. Table 4 shows the evaluation results under the four terrain types. Compared with bilinear interpolation, CLDASSD provides a more substantial improvement in mountain areas, where the RMSE can be reduced by approximately 0.1°C, and a small improvement in plain, water body, and plateau areas. These results show that CLDASSD is particularly outstanding in reconstructing areas with complex terrain gradients. The remainder of this section will combine daily and seasonal change to evaluate the performance of CLDASSD.

      Figure 5.  DEM map of the Beijing-Tianjin-Hebei region. We divided it into an 8 × 8 chessboard and assigned an ID to each patch.

      Terrain type   Set
      Mountain       {5, 12, 13, 18, 19, 20, 21, 22, 25, 26, 27, 32, 33, 34, 40, 41, 48, 49, 56, 57, 60, 61}
      Plain          {6, 7, 14, 15, 23, 28, 29, 30, 35, 36, 42, 43, 44, 50, 51, 52, 53, 55, 58, 59}
      Water body     {31, 37, 38, 39, 45, 46, 47, 54, 62, 63}
      Plateau        {0, 1, 2, 3, 4, 8, 9, 10, 11, 16, 17, 24}

      Table 3.  Classification of all the patches in Fig. 5. We divided them into four types, i.e., mountain, plain, water body, and plateau.
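
      The patch classification in Table 3 lends itself to a simple lookup, sketched below. The terrain sets are copied from Table 3; the mapping from a grid cell to a patch ID assumes row-major numbering starting at the top-left corner, which is an assumption here because the actual layout is defined by Fig. 5. The function names are hypothetical.

```python
TERRAIN_SETS = {
    "Mountain": {5, 12, 13, 18, 19, 20, 21, 22, 25, 26, 27, 32, 33, 34,
                 40, 41, 48, 49, 56, 57, 60, 61},
    "Plain": {6, 7, 14, 15, 23, 28, 29, 30, 35, 36, 42, 43, 44, 50, 51,
              52, 53, 55, 58, 59},
    "Water body": {31, 37, 38, 39, 45, 46, 47, 54, 62, 63},
    "Plateau": {0, 1, 2, 3, 4, 8, 9, 10, 11, 16, 17, 24},
}

# Invert the table: patch ID -> terrain type.
PATCH_TO_TERRAIN = {pid: name
                    for name, ids in TERRAIN_SETS.items()
                    for pid in ids}


def patch_id(row, col, n_rows, n_cols, grid=8):
    """Patch ID of a grid cell, assuming row-major numbering (see lead-in)."""
    return (row * grid // n_rows) * grid + (col * grid // n_cols)


def terrain_of(row, col, n_rows, n_cols):
    """Terrain type of the patch that contains the given grid cell."""
    return PATCH_TO_TERRAIN[patch_id(row, col, n_rows, n_cols)]
```

      The four sets contain 22, 20, 10, and 12 patches respectively and together cover all 64 IDs exactly once, so every grid cell receives exactly one terrain label.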

      Metrics      Type         Bilinear   CLDASSD_w   CLDASSD_a   CLDASSD_g   CLDASSD
      RMSE (°C)    Plain        1.01       1.01        1.01        1.0         0.99
      RMSE (°C)    Water body   1.19       1.18        1.19        1.19        1.17
      RMSE (°C)    Mountain     1.61       1.55        1.55        1.51        1.50
      RMSE (°C)    Plateau      1.35       1.33        1.33        1.33        1.32
      MAE (°C)     Plain        0.73       0.73        0.73        0.73        0.72
      MAE (°C)     Water body   0.87       0.86        0.86        0.87        0.85
      MAE (°C)     Mountain     1.16       1.13        1.13        1.1         1.11
      MAE (°C)     Plateau      1.0        0.99        0.99        0.99        0.98

      Table 4.  The reconstruction field for each time was divided into four different terrain types mentioned in Table 3 for evaluation. Although the RMSE is the highest in the mountainous area, it has the biggest improvement compared with that in bilinear interpolation (bold stands for the best).

    • We evaluate the model at different times of day (all times are UTC), and Fig. 6 shows the evaluation results. Figure 6a shows that the RMSE at 0600 UTC is the lowest, reduced by 0.13°C compared with that of bilinear interpolation. It is noteworthy that 0600 UTC corresponds to early afternoon in the local area (1400 Beijing Time, UTC+8), when the spatial distribution of temperature has the strongest correlation with terrain elevation. As an auxiliary element, the terrain elevation data can provide more detailed information for the model at that time.

      Figure 6.  Evaluation results of the different metrics for the different models at each time of day (all times are coordinated universal time, UTC). (a), (b), (c), and (d) represent RMSE, MAE, PSNR, and SSIM, respectively. CLDASSD performs best at 0600 UTC, i.e., early afternoon in the local area (1400 Beijing Time).

      To evaluate the ability of CLDASSD to capture the spatial features of the daily change in the temperature field, we use the four daily times on 15 April 2019 as an example. Specific spatial details of the reconstruction are shown in Fig. 7. The spatial distribution of our model output is more similar to the ground truth than that produced by bilinear interpolation (our models include CLDASSD and all sub-models). Specifically, our models have obvious advantages in the fine-scale reconstruction, whereas the output of bilinear interpolation lacks detail because of its averaging scheme. Our models also achieve better RMSE and SSIM than bilinear interpolation; overall, CLDASSD is superior.

      Figure 7.  Analysis from 15 April 2019 used as an example to show the daily change of the spatial distribution of the temperature field in a fixed mountainous area (the area in the black box). (a), (b), (c), and (d) denote 0000, 0600, 1200, and 1800 UTC, respectively.

      In summary, in terms of both the visualized spatial distribution and the evaluation metrics, the reconstruction field of CLDASSD at each daily time is close to the “double true values,” which shows that the performance of CLDASSD is robust to daily changes.

    • In this section, our models are re-evaluated according to season. The evaluation results are shown in Fig. 8. Relative to bilinear interpolation, CLDASSD yields a similar RMSE improvement in each season, approximately 0.07°C on average. Among the four seasons, the lowest RMSE is observed in summer, likely because the plains area is strongly affected by the summer monsoon, whereas the plateau and mountainous areas are less affected. Thus, the temperature difference is most affected by terrain, and CLDASSD can make full use of the terrain data to reconstruct fine-scale details that are not observable in the coarse-scale temperature field. In winter (DJF), however, the model fares much worse than in summer. The main reasons are that the latent and sensible heat fluxes are weaker in winter than in summer and that the spatial distribution of temperature is less affected by topography. Because the only auxiliary input to our model is the DEM, the results in winter are poorer. As also noted in section 4.3, we will consider adding more factors that affect the spatial distribution of temperature in future work.

      Figure 8.  These four figures show the evaluation results of different metrics of different models in all seasons. (a), (b), (c), and (d) represent RMSE, MAE, PSNR, and SSIM, respectively. CLDASSD performs best in summer.

      We select a representative day for each season (the first day of the month) at 0600 UTC and compare the outputs of CLDASSD and bilinear interpolation, as shown in Fig. 9. In mountain and plateau areas, CLDASSD estimates several subtle textures more accurately than bilinear interpolation. However, CLDASSD cannot reproduce the small disturbances in the plains area that HRCLDAS products capture. The ground truth, bilinear interpolation, and CLDASSD are consistent over the water body, so CLDASSD offers little improvement there.

      Figure 9.  Analysis from the first day of each representative month of the four seasons at 0600 UTC, as an example to compare the seasonal change in the reconstruction fields of bilinear interpolation and CLDASSD. The leftmost column is the product of HRCLDAS, the middle column is the output of bilinear interpolation, and the rightmost column is the output of CLDASSD. (a), (b), (c), and (d) denote the first day of January, April, July, and October, respectively.

      Reconstructing the subtle disturbances in the plains area is not the primary task of our experiments. Such disturbances may arise from various physical processes. In addition, the spatial gradient of the temperature field in the plains area is not large, so ordinary interpolation methods can also reconstruct the fine-scale temperature field there with very small error. Temperature reconstruction for complex terrain, in contrast, requires the precise spatial distribution of temperature, and under complex terrain gradients CLDASSD can estimate the fine texture of the temperature field, similar to HRCLDAS products. From the seasonal perspective, CLDASSD can reconstruct detailed textures in complex mountainous areas in every season. This robustness is similar to the robustness to daily change.

    • We have discussed the evaluation results according to daily times, seasons, and space. In terms of evaluation metrics and spatial distribution, CLDASSD has a lower RMSE than bilinear interpolation and can present finer-scale details.

      To further compare the output quality of CLDASSD and bilinear interpolation on the test set, Fig. 10 shows that the bias of bilinear interpolation lies mostly between –0.1°C and –0.2°C, and that CLDASSD improves on it by approximately 0.1°C. In addition, the bias in summer is stable and close to zero, which echoes the previous analysis. We then determine the COR and RMSE frequency between the outputs of CLDASSD and bilinear interpolation, as shown in Fig. 11. The reconstructed field, with a COR greater than 0.98 and RMSE greater than 0.75°C, demonstrates an improvement.

      Figure 10.  The line chart of bias compares the bias of bilinear interpolation and CLDASSD on the test set. Because a different amount of data is discarded by quality control each month, the axis scale is not uniform.

      Figure 11.  The frequency of COR and RMSE for bilinear interpolation and CLDASSD. It is clear that the output quality of CLDASSD is better than that of bilinear interpolation.

      However, CLDASSD still has shortcomings. As discussed in section 4.2, CLDASSD performs only slightly better than bilinear interpolation in plains areas, and some subtle disturbances are absent or not well simulated. This shortcoming is due to the small terrain gradient in plains areas, where the spatial distribution of temperature is less affected by the terrain. Therefore, in future work, we will select underlying surface elements that influence the small disturbances in plains areas, such as land cover, land use, ground incident solar radiation, and ground surface albedo. In addition, there is almost no improvement over water. This shortcoming may be due to the direct use of the same background field in CLDAS and HRCLDAS, which precludes better results.

      In addition to the above analysis of the advantages and disadvantages of the model results, we want to re-emphasize the engineering advantages of the downscaling method based on deep learning. First, the deep learning model is a powerful feature extractor, which saves us from complicated feature engineering (e.g., accurately selecting sensitive factors related to temperature) during the data preparation phase. Second, many physical parameters need to be adjusted before running a test in a physical model, whereas the deep learning model only needs a few parameters to be adjusted, and these are independent of physics. Finally, the biggest advantage of the deep learning model in downscaling tasks is that it saves considerable amounts of computing power (Reichstein et al., 2019). It needs only a few GPUs to meet our needs, which is especially useful when the research area is larger and the resolution is higher, because the time complexity of the physical model will increase exponentially.

      Finally, the downscaling task involves complex physical processes, and our experimental results still have room for improvement. We also need to have a deeper understanding of the use of deep learning for downscaling, such as designing more suitable models for downscaling tasks.

    5.   Conclusion
    • To fill the gap in HRCLDAS assimilation data caused by the lack of surface observation data before 2008, we have designed a deep-learning-based model as a preliminary study of this problem.

      First, we propose an effective quality control method to control the quality of the paired data from CLDAS and HRCLDAS within a time range from 2018 to 2019. Then, informed by the deep-learning-based super-resolution algorithm, we propose a new temperature field downscaling model, named CLDASSD. The design of this model considers the attention mechanism and the global skip connection. To test the downscaling ability of CLDASSD in a variety of terrain areas, we take the Beijing-Tianjin-Hebei region with multiple landforms as the research area. Moreover, we use the experimental data from 31 March 2018 to 28 February 2019, to train the model, and the remaining data are used for testing. Finally, the performance of CLDASSD is evaluated by the “double true values” scheme from the perspectives of daily change, seasonal change, and spatial distribution.

      The results of the comparison and ablation experiments show that CLDASSD is far superior to bilinear interpolation in evaluation metrics and spatial distribution. CLDASSD can estimate finer-scale temperature field details, which are close to HRCLDAS 2-m temperature products. The performance of CLDASSD is superior especially over complex terrain, because CLDASSD can effectively make use of the terrain data, which provide considerable information for downscaling the temperature field.

      Our research reveals the effectiveness of using the deep learning-based super-resolution algorithm for temperature field downscaling tasks. As noted in the introduction, our work is a preliminary study, so CLDASSD has not been evaluated on historical data (before 2008). In future work, we will continue to improve CLDASSD and back-calculate the data prior to 2008.

      Acknowledgements. We would like to thank the National Key Research and Development Program of China (Grant No. 2018YFC1506604) and the National Natural Science Foundation of China (Grant No. 91437220) who supported this research.
