
Identification of Convective and Stratiform Clouds Based on the Improved DBSCAN Clustering Algorithm


doi: 10.1007/s00376-021-1223-7

  • A convective and stratiform cloud classification method for weather radar is proposed based on the density-based spatial clustering of applications with noise (DBSCAN) algorithm. To identify convective and stratiform clouds in different developmental phases, two-dimensional (2D) and three-dimensional (3D) models are proposed by applying reflectivity factors at 0.5° and at 0.5°, 1.5°, and 2.4° elevation angles, respectively. According to the thresholds of the algorithm, which include echo intensity, the echo top height of 35 dBZ (ET), density threshold, and ε neighborhood, cloud clusters can be marked into four types: deep-convective cloud (DCC), shallow-convective cloud (SCC), hybrid convective-stratiform cloud (HCS), and stratiform cloud (SFC) types. Each cloud cluster type is further identified as a core area and boundary area, which can provide more abundant cloud structure information. The algorithm is verified using the volume scan data observed with new-generation S-band weather radars in Nanjing, Xuzhou, and Qingdao. The results show that cloud clusters can be intuitively identified as core and boundary points, which change in area continuously during the process of convective evolution, by the improved DBSCAN algorithm. Therefore, the occurrence and disappearance of convective weather can be estimated in advance by observing the changes of the classification. Because density thresholds are different and multiple elevations are utilized in the 3D model, the identified echo types and areas are dissimilar between the 2D and 3D models. The 3D model identifies larger convective and stratiform clouds than the 2D model. However, the developing convective clouds of small areas at lower heights cannot be identified with the 3D model because they are covered by thick stratiform clouds. In addition, the 3D model can avoid the influence of the melting layer and better suggest convective clouds in the developmental stage.
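    Abstract (translated from Chinese): Based on the DBSCAN clustering algorithm, this paper proposes a method for identifying convective and stratiform clouds with weather radar. To identify convective and stratiform clouds at different developmental stages, a 2D model and a 3D model are built from reflectivity factor data at the single 0.5° elevation angle and at the three elevation angles 0.5°, 1.5°, and 2.4°, respectively. According to the preset echo intensity threshold, the echo top height of 35 dBZ reflectivity, the density threshold, and the neighborhood, cloud clusters are divided into four types: deep-convective cloud, shallow-convective cloud, hybrid convective-stratiform cloud, and stratiform cloud. Each cloud type is further divided into a core area and a boundary area to provide richer cloud structure information. The algorithm is verified with volume scan data from the new-generation S-band weather radars in Nanjing, Xuzhou, and Qingdao. The results show that the DBSCAN algorithm identifies the core and boundary areas of each cloud cluster and gives intuitive classification results. As convection deepens over time, cloud cluster boundaries are clearly seen to develop gradually into cloud cluster cores, and the formation and dissipation of some cloud clusters can be seen in the results from different volume scan times. Because the 3D model uses multiple elevation layers and sets density thresholds by range segment, the echo areas of each cloud type identified by the two models differ. The 3D model identifies larger areas of convective and stratiform clouds than the 2D model, but its shortcoming is that thick stratiform clouds at upper levels may mask weakly developed, small-area convective clouds at low levels. In addition, the 3D model can avoid the influence of the melting layer and better reveal convective clouds in the developmental stage.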
  • Figure 1.  Schematic diagram of the DBSCAN algorithm.

    Figure 2.  Flow chart of cloud cluster identification based on the improved DBSCAN algorithm.

    Figure 3.  A schematic of the identification process with the improved DBSCAN algorithm in which A and B are identified as core points, C is defined as a boundary point, and N is defined as a noise point.

    Figure 4.  The reflectivity PPI at 0.5°, 1.5°, and 2.4° observed with the Nanjing SA radar at 1000 UTC on 28 April 2015 (a–c), and the results of the cloud classification performed with the 2D model (d) and 3D model (e) in which the symbol _c represents the core area, and _b represents the boundary area of each type of cloud cluster. The black plus signs are the locations of lightning, the black and red lines are the locations of sections corresponding to Fig. 5, and the range rings are 50 km apart (same below). To compare the recognition effect, classification PPIs of the convective cloud (red area) and stratiform cloud (blue area) are attached to (f), as identified by Xiao and Liu (2007) according to the fuzzy logic method.

    Figure 5.  The reflectivity of the vertical sections corresponding to the black line (a) and the red line (b) in Fig. 4 (Range is horizontal distance, height is echo height, and the black dotted line is the 3.8 km height of the 0°C layer in Nanjing at 1200 UTC on 28 April 2015).

    Figure 6.  The number of points of each cloud type identified with the 2D and 3D models.

    Figure 7.  Similar to Fig. 4, but the radar data were observed in Xuzhou at 0754 UTC 6 July 2019; the black line is the location of the section corresponding to Fig. 8.

    Figure 8.  The reflectivity vertical section corresponding to the black line in Fig. 7a; the black dotted line is the 4.5 km height of the 0°C layer in Xuzhou at 1200 UTC 6 July 2019.

    Figure 9.  The changing trend of the number of cloud clusters identified by the 3D model; (a) 0940–1904 UTC and (b) 1128–1220 UTC (scale of the point number of the SFC boundary is on the right side of the Y-axis, and the other seven types of clouds are on the left).

    Figure 10.  The reflectivity image of the 0.5° PPI observed with the Qingdao SA radar at 1128 UTC, 1146 UTC, 1203 UTC, and 1220 UTC on 17 May 2020 (a–d), and the results of the cloud classification performed with the 3D model (e–f).

    Table 1.  Point number of each cloud type obtained under different density thresholds.

    Cloud type           Threshold of density
                         3        4        5        6        7
    Core of DCC          953      812      733      725      699
    Boundary of DCC      598      566      508      496      444
    Core of SCC          812      779      797      795      664
    Boundary of SCC      1491     1356     1252     1096     914
    Core of HCS          2355     2100     1992     2041     1935
    Boundary of HCS      2505     2233     2067     1837     1524
    Core of SFC          22467    21560    20543    19085    15921
    Boundary of SFC      6179     6276     6424     6541     6906

    Table 2.  Point number of each cloud type in different ε neighborhoods.

    Cloud type           ε neighborhood
                         1        2        3        4
    Core of DCC          733      857      1015     1128
    Boundary of DCC      508      1097     1695     2362
    Core of SCC          797      413      278      234
    Boundary of SCC      1252     2132     2705     3440
    Core of HCS          1992     2020     1874     1667
    Boundary of HCS      2067     3503     4584     5482
    Core of SFC          20543    22592    22435    22277
    Boundary of SFC      6424     9476     12161    13877
  • Austin, P. M., and R. A. Houze, 1972: Analysis of the structure of precipitation patterns in New England. J. Appl. Meteorol. Climatol., 11(6), 926−935, https://doi.org/10.1175/1520-0450(1972)011<0926:AOTSOP>2.0.CO;2.
    Bao, M., 2007: The statistical analysis of the persistent heavy rain in the last 50 years over China and their backgrounds on the large scale circulation. Chinese Journal of Atmospheric Sciences, 31(5), 779−792, https://doi.org/10.3878/j.issn.1006-9895.2007.05.03. (in Chinese with English abstract)
    Behrang, M. A., E. Assareh, A. Ghanbarzadeh, and A. R. Noghrehabadi, 2010: The potential of different artificial neural network (ANN) techniques in daily global solar radiation modeling based on meteorological data. Solar Energy, 84(8), 1468−1480, https://doi.org/10.1016/j.solener.2010.05.009.
    Biggerstaff, M. I., and S. A. Listemaa, 2000: An improved scheme for convective/stratiform echo classification using radar reflectivity. J. Appl. Meteorol. Climatol., 39(12), 2129−2150, https://doi.org/10.1175/1520-0450(2001)040<2129:AISFCS>2.0.CO;2.
    Chen, G. Y., X. X. Ding, and L. Y. Zhao, 2005: An automatical pattern recognition techniques of cloud based on fuzzy neural network. Chinese Journal of Atmospheric Sciences, 29(5), 837−844, https://doi.org/10.3878/j.issn.1006-9895.2005.05.16. (in Chinese with English abstract)
    Churchill, D. D., and R. A. Houze, 1984: Development and structure of winter monsoon cloud clusters on 10 December 1978. J. Atmos. Sci., 41, 933−960, https://doi.org/10.1175/1520-0469(1984)041<0933:dasowm>2.0.co;2.
    De Mott, C. A., R. Cifelli, and S. A. Rutledge, 1995: Improved method for partitioning radar data into convective and stratiform components. Preprints, 27th Conf. on Radar Meteorology, Vail, CO, Amer. Meteor. Soc., 233−236.
    Feng, S. R., and W. J. Xiao, 2008: An improved DBSCAN clustering algorithm. Journal of China University of Mining & Technology, 37(1), 105−111, https://doi.org/10.3321/j.issn:1000-1964.2008.01.021. (in Chinese with English abstract)
    Ho, H. C., A. Knudby, P. Sirovyak, Y. M. Xu, M. Hodul, and S. B. Henderson, 2014: Mapping maximum urban air temperature on hot summer days. Remote Sensing of Environment, 154, 38−45, https://doi.org/10.1016/j.rse.2014.08.012.
    Houze, R. A. Jr., 1973: A climatological study of vertical transports by cumulus-scale convection. J. Atmos. Sci., 30, 1112−1123, https://doi.org/10.1175/1520-0469(1973)030<1112:acsovt>2.0.co;2.
    Kusiak, A., H. Y. Zheng, and Z. Song, 2009: Wind farm power prediction: A data-mining approach. Wind Energy, 12(3), 275−293, https://doi.org/10.1002/we.295.
    Li, F., L. P. Liu, H. Y. Wang, and Y. Jiang, 2012: Identification of non-precipitation meteorological echoes with Doppler weather radar. Journal of Applied Meteorological Science, 23(2), 147−158, https://doi.org/10.3969/j.issn.1001-7313.2012.02.003. (in Chinese with English abstract)
    Li, Y. Y., R. C. Yu, Y. P. Xu, and X. H. Zhang, 2003: The formation and diurnal changes of stratiform clouds in southern China. Acta Meteorologica Sinica, 61(6), 733−743, https://doi.org/10.3321/j.issn:0577-6619.2003.06.010. (in Chinese with English abstract)
    Liu, W., Y. L. Wang, and Y. J. Zhao, 2004: Application of NOAA AVHRR data to identification of precipitation area. Meteorological Monthly, 30(2), 3−9, https://doi.org/10.3969/j.issn.1000-0526.2004.02.001. (in Chinese with English abstract)
    Liu, Y., J. Xia, C. X. Shi, and Y. Hong, 2009: An improved cloud classification algorithm for China's FY-2C multi-channel images using artificial neural network. Sensors, 9(7), 5558−5579, https://doi.org/10.3390/s90705558.
    Mellit, A., A. M. Pavan, and M. Benghanem, 2013: Least squares support vector machine for short-term prediction of meteorological time series. Theor. Appl. Climatol., 111(1−2), 297−307, https://doi.org/10.1007/s00704-012-0661-7.
    Pérez, J. C., A. Cerdeña, A. González, and M. Armas, 2009: Nighttime cloud properties retrieval using MODIS and artificial neural networks. Advances in Space Research, 43(5), 852−858, https://doi.org/10.1016/j.asr.2008.06.013.
    Steiner, M., R. A. Houze Jr., and S. E. Yuter, 1995: Climatological characterization of three-dimensional storm structure from operational radar and rain gauge data. J. Appl. Meteorol. Climatol., 34(9), 1978−2007, https://doi.org/10.1175/1520-0450(1995)034<1978:CCOTDS>2.0.CO;2.
    Voyant, C., G. Notton, S. Kalogirou, M. L. Nivet, C. Paoli, F. Motte, and A. Fouilloy, 2017: Machine learning methods for solar radiation forecasting: A review. Renewable Energy, 105, 569−582, https://doi.org/10.1016/j.renene.2016.12.095.
    Wen, H., L. P. Liu, and Y. Zhang, 2017: Improvements of ground clutter identification algorithm for Doppler weather radar. Plateau Meteorology, 36(3), 736−749, https://doi.org/10.7522/j.issn.1000-0534.2016.00063. (in Chinese with English abstract)
    Xiao, Y. J., and L. P. Liu, 2007: Identification of stratiform and convective cloud using 3D radar reflectivity data. Chinese Journal of Atmospheric Sciences, 31(4), 645−654, https://doi.org/10.3878/j.issn.1006-9895.2007.04.09. (in Chinese with English abstract)
    Zhang, J. J., X. Y. Zhao, and Y. Huang, 2010: Review of monitoring methods for rainfall cloud remote sensing. Meteorological Science and Technology, 38(5), 588−593, https://doi.org/10.3969/j.issn.1671-6345.2010.05.010. (in Chinese with English abstract)
    Zhao, Y., and Y. F. Qian, 2008: Characteristics of the severe rain and its relation to flood and drought in the Changjiang and Huaihe areas in summer. Journal of Nanjing University (Natural Sciences), 44(3), 237−249, https://doi.org/10.3321/j.issn:0469-5097.2008.03.002. (in Chinese with English abstract)
    Zhong, L. Z., L. P. Liu, and S. S. Gu, 2007: A algorithm identifying convective and strariform in mixed precipitation and its application to estimating precipitation. Plateau Meteorology, 26(3), 593−602, https://doi.org/10.3321/j.issn:1000-0534.2007.03.022. (in Chinese with English abstract)
    Zhou, S. D., W. K. Zhou, H. G. Zhu, C. X. Wang, and Y. Wang, 2010: Impact of climate change on agriculture and its countermeasures. Journal of Nanjing Agricultural University (Social Sciences Edition), 10(1), 34−39, https://doi.org/10.3969/j.issn.1671-7465.2010.01.006. (in Chinese with English abstract)

Manuscript History

Manuscript received: 09 June 2021
Manuscript revised: 15 October 2021
Manuscript accepted: 09 November 2021

    Corresponding author: Zhiqun HU, huzq@cma.gov.cn
  • 1. State Key Lab of Severe Weather, Chinese Academy of Meteorological Sciences, Beijing 100081, China
  • 2. School of Atmospheric Sciences, Chengdu University of Information Technology, Chengdu 610225, China


    • Rainfall systems are generally composed of cumulus or mixed clouds, and convective clouds are embedded in large-scale stratiform cloud precipitation. Severe weather mainly occurs in convective cloud areas, but the duration and amount of precipitation are determined by the stratiform cloud area (Zhao and Qian, 2008; Zhou et al., 2010; Zhong et al., 2007). Convective and stratiform clouds differ in their generation mechanisms, evolution, dissipation, and moving speed, as well as in their contribution to heat change in the atmosphere. Identifying and classifying clouds in rainstorms as convective or stratiform therefore helps to better understand their mechanisms and to improve precipitation estimation. This can effectively enhance the accuracy of nowcasting, monitoring, and early warning of disastrous weather, and improve aerospace operation and weather modification (Bao, 2007; Xiao and Liu, 2007; Zhong et al., 2007).

      To date, many studies have been performed on the identification of convective and stratiform clouds. Early recognition of stratiform cloud precipitation was mostly implemented by means of the 0°C-layer bright band, but this method is limited because stratiform clouds are difficult to identify before they have evolved to the mature stage. Later, many methods were studied based on rain gauge data; in these methods, a threshold reflectivity factor was introduced to judge whether convective or stratiform clouds were present according to the echo intensity in a precipitation area. This approach determines the precipitation center of convective clouds and is referred to as the background-excess technique (BET) (Austin and Houze, 1972; Houze, 1973). Churchill and Houze (1984) further applied the BET to a two-dimensional (2D) structure: the convective cloud center is first determined using a threshold of the radar reflectivity factor, and then a fixed influence radius around the center is used to mark the area of convective clouds. Steiner et al. (1995) improved the identification by changing the influence radius of Churchill’s method to a function of the reflectivity factor and by modifying the reflectivity threshold to a function of the average reflectivity in the local background. This method is named the peakness threshold method and is also essentially a BET. De Mott et al. (1995) suspected that convective clouds inclined with height may be misclassified if the classification is performed only at the lower level; therefore, they developed the three-dimensional (3D) peakness threshold method and extended the classification to echo tops by implementing the BET at each elevation of the radar volume scan. Biggerstaff and Listemaa (2000) noted two main misclassifications in Steiner's algorithm: thick stratiform clouds are misclassified as convective clouds, and the periphery of the convective core is misclassified as stratiform clouds. Therefore, the vertical profile of reflectivity (VPR) and the altitude of the 0°C isotherm were applied to improve the algorithm. By means of radar reflectivity, cloud clusters are classified into convective and stratiform clouds according to their 3D structure, and this method is generally utilized in the United States at present.

      In China, most research on cloud classification has been based on satellite and radar data (Zhang et al., 2010). Liu et al. (2004) proposed stratiform cloud precipitation identification methods after comprehensively analyzing the image features of precipitating stratiform clouds using infrared and visible channel data from NOAA AVHRR. Li et al. (2003) analyzed the diurnal variation characteristics of stratiform clouds in southern China using satellite, ground observation, and radiosonde data. Xiao and Liu (2007) classified convective and stratiform clouds with the fuzzy logic method according to the 3D morphological distribution of radar reflectivity.

      In recent years, artificial intelligence has developed rapidly. Machine learning, which is one of the core areas of artificial intelligence, has brought new opportunities to meteorological services. Machine learning algorithms, such as neural networks, random forests, decision trees, support vector machines, and Bayesian classifiers, are widely used in the field of meteorology (Kusiak et al., 2009; Behrang et al., 2010; Mellit et al., 2013; Ho et al., 2014; Voyant et al., 2017). Using a neural network, Pérez et al. (2009) retrieved the features of cloud images at night, and Liu et al. (2009) separated clouds from the images observed by the FY2C meteorological satellite. Based on infrared cloud images, Chen et al. (2005) combined neural networks with fuzzy logic methods to classify clouds.

      Density-based spatial clustering of applications with noise (DBSCAN) is a typical density-based clustering algorithm in machine learning. It does not require cluster centers or the number of clusters to be specified in advance and divides points into clusters only according to density (Feng and Xiao, 2008). Using data observed with the new-generation S-band Doppler weather radars (CINRAD/SA) in Nanjing, Xuzhou, and Qingdao as a demonstration, a classification method for convective and stratiform clouds is proposed herein based on the improved DBSCAN clustering algorithm.

      Section 2 of this paper briefly introduces the data source and preprocessing approach. Section 3 first introduces the improved DBSCAN clustering algorithm and then illustrates the cloud cluster classification steps with the 2D and 3D models. Section 4 analyzes three cases to verify the proposed models. Finally, some conclusions and a discussion of the cloud classification algorithm are presented in the last section.

    2.   Data
    • The data used in this study were collected by SA radars in Nanjing, Xuzhou, and Qingdao. The maximum detection range of the radar is 460 km, and the radial resolution of the reflectivity factor is 1 km. The VCP21 (volume coverage pattern, scan strategy #2, version 1) scan mode was adopted, which comprises nine elevation angles (0.5°, 1.5°, 2.4°, 3.4°, 4.3°, 6.0°, 9.9°, 14.6°, and 19.5°), and the duration of each volume scan was six minutes. First, when echoes occupied less than 70% of the area around a point, the point was considered isolated point clutter and filtered out. Then, the radial velocity and spectral width were simply processed to remove nonmeteorological echoes and ground clutter by adjusting the thresholds of radial velocity, echo intensity, and intensity changes with range and elevation angle, to reduce the misjudgment of meteorological echoes (Li et al., 2012; Wen et al., 2017). The lightning data were obtained from the ground lightning recorder in Jiangsu Province, and the 0°C-height data were obtained from the fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalysis (ERA5).

    3.   Brief introduction of the algorithm
    • The main definitions of DBSCAN include the following.

      Density: the number of objects (points) in a given area.

      Core point: a point whose density reaches the threshold set by the algorithm; that is, the number of points in its ε neighborhood is not less than Pmin, where Pmin is the threshold on the number of points within a distance of ε.

      ε neighborhood: the area within a radius of ε around a given point is called the ε neighborhood of the point.

      Direct density reachability: given a set of points D, if point p is in the ε neighborhood of point q and q is a core point, then p is directly density-reachable from q.

      Density reachability: given a point sequence q0, q1, …, qk in which each qi is directly density-reachable from qi−1, qk is density-reachable from q0; that is, density reachability is the propagation of direct density reachability.

      Density connection: if points q and k are both density-reachable from the same core point, then q and k are density-connected.

      Boundary point: a noncore point in a cluster; it is density-reachable from a core point, but no other point is density-reachable from it.

      Noise point: a point that does not belong to any cluster and is not density-reachable from any core point.

      The above definitions are demonstrated in Fig. 1, in which Pmin is set to 5. The red points are core points because each has at least 5 points within its ε neighborhood (black circle), and the black points are noncore points because each has fewer than 5 points in its ε neighborhood. All red points in the core area are directly density-reachable from their neighboring core points, and the red points connected by green arrows form a density-reachable sequence. All points in the ε neighborhoods of the points in the sequence are density-connected. A new cluster with p as a core point is constructed if the ε neighborhood of p contains at least Pmin points; then, a maximal density-connected cluster is derived according to the density-reachable relation. Some clusters may be merged, and the process ends when no new point can be added to any cluster (Feng and Xiao, 2008).

      Figure 1.  Schematic diagram of the DBSCAN algorithm.
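
The definitions above can be sketched in a few lines of Python. This is a minimal illustration of plain DBSCAN labeling, not the authors' improved implementation. The function name `dbscan_labels`, the brute-force neighbor search, and the convention that a point counts itself in its own ε neighborhood are assumptions made here for illustration.

```python
from math import dist


def dbscan_labels(points, eps, p_min):
    """Return (kinds, clusters) for a list of coordinate tuples.

    kinds[i] is "core", "boundary", or "noise"; clusters[i] is a cluster
    index, or -1 for noise. Assumption: a point counts itself as a member
    of its own eps neighborhood.
    """
    n = len(points)
    # Brute-force eps neighborhoods (fine for a small illustration).
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= p_min for nb in neighbors]

    clusters = [-1] * n
    next_id = 0
    for seed in range(n):
        if not is_core[seed] or clusters[seed] != -1:
            continue
        # Grow a maximal density-connected cluster from this core point.
        clusters[seed] = next_id
        stack = [seed]
        while stack:
            q = stack.pop()
            for j in neighbors[q]:
                if clusters[j] == -1:
                    clusters[j] = next_id  # directly density-reachable from q
                    if is_core[j]:
                        stack.append(j)  # only core points propagate further
        next_id += 1

    kinds = []
    for i in range(n):
        if is_core[i]:
            kinds.append("core")
        elif clusters[i] != -1:
            kinds.append("boundary")
        else:
            kinds.append("noise")
    return kinds, clusters
```

In this sketch a noncore point that lies in the ε neighborhood of a core point joins that core point's cluster as a boundary point, while a point reachable from no core point keeps the cluster index -1 and is reported as noise.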

    • Using reflectivity and echo tops from radar base data, the DBSCAN clustering algorithm is improved to realize multiclass classification of clouds. 2D and 3D models are built to identify convective and stratiform clouds at different development stages. Figure 2 is the general flow chart of the algorithm.

      Figure 2.  Flow chart of cloud cluster identification based on the improved DBSCAN algorithm.

    • The steps of the 2D model construction are as follows.

      (a) After quality control, reflectivity data at the 0.5° elevation within a range of 230 km are selected to classify convective and stratiform clouds using different thresholds.

      (b) A point A is arbitrarily selected in the echo data (Fig. 3), and A is marked as a core point of the deep-convective cloud (DCC) if the conditions are satisfied, that is, if there are at least five points whose values are greater than or equal to 45 dBZ within the 3 × 3 window and if the height of the 35 dBZ echo top (ET) at A is more than 6 km.

      Figure 3.  A schematic of the identification process with the improved DBSCAN algorithm in which A and B are identified as core points, C is defined as a boundary point, and N is defined as a noise point.

      (c) Next, each point in the neighborhood of A is checked. If a point B satisfies the conditions given in the previous step, B is also marked as a core point of the DCC.

      (d) The remaining points in the neighborhood of A are checked in the same way. If a point C does not satisfy the conditions, C is marked as a boundary point of the DCC.

      (e) If a point N can be marked as neither a core point nor a boundary point, N is marked as a noise point. This process continues until no new point can be marked.

      (f) Repeating the above steps, a point is marked as a shallow-convective cloud (SCC) when Z ≥ 45 dBZ and ET < 6 km or when 37 ≤ Z < 45 dBZ and ET ≥ 5 km, as a hybrid convective-stratiform cloud (HCS) when 37 ≤ Z < 45 dBZ and ET < 5 km or when 30 ≤ Z < 37 dBZ and ET ≥ 3 km, and as a stratiform cloud (SFC) for the remaining echoes.

      In general, ET refers to the height of 18 dBZ. However, due to the influence of the radar scanning strategy, this height may not be detected, which causes unrealistic ET values. The reflectivity of convective clouds is almost always greater than 35 dBZ; therefore, the convective clouds are not omitted by using the height of 35 dBZ as the value of ET.
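
The thresholds in steps (b) and (f) can be written out as a small Python sketch. It is an illustration under stated assumptions rather than the authors' code: the function names `cloud_type` and `is_dcc_core` are hypothetical, the input grids are assumed to be plain 2D lists indexed by range and azimuth, azimuth wrap-around is ignored, and an ET of exactly 6 km with Z ≥ 45 dBZ is assigned to the DCC type because the text leaves that boundary case open.

```python
def cloud_type(z_dbz, et_km):
    """Map reflectivity (dBZ) and 35 dBZ echo top height (km) to a cloud type."""
    if z_dbz >= 45:
        # Assumption: ET of exactly 6 km is treated as deep convection.
        return "DCC" if et_km >= 6 else "SCC"
    if 37 <= z_dbz < 45:
        return "SCC" if et_km >= 5 else "HCS"
    if 30 <= z_dbz < 37 and et_km >= 3:
        return "HCS"
    return "SFC"  # all remaining echoes


def is_dcc_core(z, et, i, j, p_min=5):
    """Check the step (b) condition for grid point (i, j).

    z and et are 2D lists (hypothetical range-by-azimuth grids) holding
    reflectivity in dBZ and 35 dBZ echo top height in km.
    """
    rows, cols = len(z), len(z[0])
    # Count points of at least 45 dBZ in the 3 x 3 window, clipped at edges.
    strong = sum(
        1
        for r in range(max(i - 1, 0), min(i + 2, rows))
        for c in range(max(j - 1, 0), min(j + 2, cols))
        if z[r][c] >= 45
    )
    return strong >= p_min and et[i][j] > 6
```

The same window count could be reused for the SCC, HCS, and SFC types by swapping in their reflectivity and ET conditions, which is how step (f) repeats the procedure.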

    • For cloud clusters in different development stages, some misclassification exists if only reflectivity data at a 0.5° elevation are used. Based on the 2D model, a 3D model is further developed in which some parameters and thresholds are improved as follows.

      (a) Reflectivity data at three elevation angles, namely, 0.5°, 1.5°, and 2.4°, are used in the identification process. At the third elevation (2.4°) of the VCP21 scan strategy, the echo height is 13.78 km at a range of 230 km. Therefore, three elevations are appropriate for building a 3D model.

      (b) Considering radar beam broadening, the 230 km range is divided into three sections and is modeled separately; these sections range from 0 km to 80 km, from 80 km to 150 km, and from 150 km to 230 km, and the corresponding Pmin thresholds are set as 10, 8, and 5, respectively.
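
      The two settings above can be illustrated with a short sketch. The 13.78 km value in (a) is close to what the simple geometric approximation h ≈ r sin θ + r²/(2R) gives with the true Earth radius R = 6371 km and no refraction correction; whether the authors used this formula is an assumption. The function names, the `k_e` parameter, and the treatment of the section edges at exactly 80 km and 150 km are also assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def beam_height_km(range_km: float, elev_deg: float, k_e: float = 1.0) -> float:
    """Approximate beam-centre height above the antenna.

    Uses h = r*sin(theta) + r^2 / (2*k_e*R). k_e = 1 means no refraction
    correction; k_e = 4/3 is the common effective-Earth-radius convention.
    """
    theta = math.radians(elev_deg)
    return range_km * math.sin(theta) + range_km ** 2 / (2.0 * k_e * EARTH_RADIUS_KM)


def p_min(range_km: float) -> int:
    """Density threshold Pmin for the three range sections of the 3D model.

    Edge handling (80 km -> 8, 150 km -> 5) is an assumption.
    """
    if not 0.0 <= range_km <= 230.0:
        raise ValueError("range must lie within 0-230 km")
    if range_km < 80.0:
        return 10
    if range_km < 150.0:
        return 8
    return 5


h_plain = beam_height_km(230.0, 2.4)            # no refraction correction
h_43 = beam_height_km(230.0, 2.4, k_e=4.0 / 3)  # 4/3 effective radius
thresholds = [p_min(r) for r in (50.0, 100.0, 200.0)]
```

      With `k_e = 1` the function returns about 13.78 km at 230 km and 2.4°; with the 4/3 effective-radius convention it returns roughly 12.7 km, so the quoted height depends on the refraction assumption.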

    4.   Case analysis
    • Three cases are used to demonstrate the rationality and effectiveness of the improved DBSCAN method in cloud classification. In cases 1 and 2, convective clouds with strong development are embedded in large-scale stratiform clouds, and melting layer echoes are obvious. The comparison images are screenshots from the operational software, in which clouds are classified according to the fuzzy logic method (Xiao and Liu, 2007). The classification results of the two methods are basically the same, but some HCSs near the melting layer are often identified as SFCs by the fuzzy logic method. Whereas the fuzzy logic method judges each point on its own, the improved DBSCAN method marks clouds according to the spatial characteristics of multiple points with the same properties. Therefore, the improved DBSCAN method obtains better recognition results than the fuzzy logic method.

      Case 3 is a hail and thunderstorm gale weather event, and the identification results of convective clouds can clearly show the development process of disastrous weather.

    • A multicell precipitation weather process observed with the SA radar in Nanjing at 1000 UTC on 28 April 2015 is taken as an example. Figures 4a–c show the 0.5°, 1.5°, and 2.4° PPIs (plan position indicators), which are blocked around the 135° azimuth. Figures 4d and e are the results of the cloud classifications conducted with the 2D and 3D models corresponding to Fig. 4a, respectively, in which eight colors are used to represent the eight types of clouds, namely, the core and boundary areas of the DCC, SCC, HCS, and SFC types. It can be seen from Figs. 4d and e that there are many convective clouds (marked with purple and red) embedded in large-scale HCSs and SFCs (marked with yellow and green, respectively), and lightning (marked with black plus signs) occurs in the convective cloud regions.

      Figure 4.  The reflectivity PPI at 0.5°, 1.5°, and 2.4° observed with the Nanjing SA radar at 1000 UTC on 28 April 2015 (a–c), and the results of the cloud classification performed with the 2D model (d) and 3D model (e) in which the symbol _c represents the core area, and _b represents the boundary area of each type of cloud cluster. The black plus signs are the locations of lightning, the black and red lines are the locations of sections corresponding to Fig. 5, and each circle is 50 km apart (same below). To compare the recognition effect, the classification PPI of the convective cloud (red area) and stratiform cloud (blue area) identified with the fuzzy logic method of Xiao and Liu (2007) is shown in (f).

      Clouds often occur above the height of the 0.5° elevation scan and exist at different elevations. Therefore, a 3D model with reflectivity at elevations of 0.5°, 1.5°, and 2.4° is more accurate for identifying cloud clusters than a 2D model. Figures 4a–c show the PPIs of the lowest three elevation angles. The echo height increases with elevation, and the 3D model can distinguish the core and boundary of the cloud cluster well.

      Figures 4d and e show that the cloud classifications performed with the 2D and 3D models are basically the same, but differences are seen in some areas. For example, almost no clouds are seen in Fig. 4d, but full, weak SFCs are seen in Fig. 4e in the two boxes indicated by the black dotted line, suggesting that a weak SFC exists above the 0.5° elevation height that cannot be obtained with the 2D model. In addition, developing DCCs are marked by black ellipses in Fig. 4d, in which the core area of the DCC is still relatively small, but significantly increases in Fig. 4e, in which it is almost marked as the core of the DCC. Therefore, the 3D model has great advantages in identifying SFCs and developing DCCs. However, some developing convective clouds of small areas at the low level cannot be identified with the 3D model because they are covered by thick stratiform clouds. Therefore, the classification results may be more reasonable when the two models are combined to identify clouds together.

      To further verify the classification process, two vertical sections of reflectivity corresponding to the black (Fig. 5a) and red (Fig. 5b) lines in Fig. 4 are shown, in which the horizontal axis denotes the length of the section, the vertical axis represents the height, and the black dotted line shows the height of the 0°C layer according to the atmospheric reanalysis data. In Fig. 5a, two DCCs are located in the horizontal distance area from 0 km to 35 km; these clouds have obvious columnar structures with large vertical thicknesses and uneven echo tops (Zhang et al., 2010), and echoes stronger than 45 dBZ extend above 7 km. The maximum intensities from 70 km to 95 km and from 105 km to 115 km are over 45 dBZ, and the 35 dBZ height is above 5 km, much higher than the 3.8 km 0°C-level height. It is therefore reasonable that the improved DBSCAN algorithm identifies HCSs in these areas, which have obvious convective structures, whereas the fuzzy logic method identifies them as SFCs (Fig. 4f). In addition, the intensities from 50 km to 70 km in Fig. 5a are relatively weak and are correctly identified as SFCs with the improved DBSCAN algorithm.

      Figure 5.  The reflectivity of the vertical sections corresponding to the black line (a) and the red line (b) in Fig. 4 (Range is horizontal distance, height is echo height, and the black dotted line is the 3.8 km height of the 0°C layer in Nanjing at 1200 UTC on 28 April 2015).

      In Fig. 5b, there is a strong convective cloud area from 5 km to 35 km in which the reflectivity of the echo center is larger than 65 dBZ and the 45 dBZ height is more than 12 km; this area is identified as DCCs in Fig. 4. In addition, there are several convective bubbles from 60 km to 120 km. The bubble from 60 km to 70 km is weak, and the 35 dBZ echo extends to a height of approximately 5 km and is classified as an SCC in Fig. 4. The bubble at approximately 110 km is relatively strong, showing an obvious columnar structure, and the 35 dBZ height is over 9 km, but the cloud is only marked as the SCC boundary because there are not sufficient core points in the SCC. However, this area further evolves to a core SCC area after six minutes (figure omitted); therefore, the improved DBSCAN algorithm suggests a developing convective cloud.

      According to the above analysis, some differences exist in cloud classifications between the 2D and 3D models. To further illustrate these differences, the number of points of each cloud type is calculated and shown in Fig. 6. The total points in the 2D and 3D models are 34 316 and 42 501, respectively. The numbers of convective points are slightly different between the 2D and 3D models, but the total number of points identified with the 3D model is nearly 8 000 larger than that identified with the 2D model, which is mainly caused by the difference in SFCs. There are two reasons why the DCC, HCS, and SFC areas identified with the 3D model are larger than those identified with the 2D model. One is that the 3D model is based on the reflectivity factor at three elevation angles, allowing a larger height scope for identifying points. The other is that the range of the 3D model is divided into several sections to set different density threshold values that can mark more points at far distances. The SCC area identified with the 2D model is slightly larger than that identified with the 3D model, which may be caused by the weak development of the SCC during the event. In addition, the number of points of each cluster type is counted under different settings of two parameters of the improved DBSCAN algorithm, the density threshold and the ε neighborhood.

      Figure 6.  The number of points of each cloud type identified with the 2D and 3D models.

      The parameter change in the 2D model is taken as an example to illustrate the impact of the parameters on the classification. For the density threshold (Table 1), the number of echo points is listed for densities of 3, 4, 5, 6, and 7 points within a one-kilometer neighborhood. As the density threshold increases, the conditions for identifying clouds become stricter and the echo area of the cloud cluster becomes smaller. For the ε neighborhood (Table 2), the number of points is listed for ε values of 1 km, 2 km, 3 km, and 4 km. Contrary to the trend for the density threshold, the larger the ε neighborhood is, the larger the echo area of the identified cloud cluster. Tables 1 and 2 show that the values of the density threshold and ε neighborhood greatly influence the classification results.

      Cloud type          Threshold of density (points)
                          3         4         5         6         7
      Core of DCC         953       812       733       725       699
      Boundary of DCC     598       566       508       496       444
      Core of SCC         812       779       797       795       664
      Boundary of SCC     1491      1356      1252      1096      914
      Core of HCS         2355      2100      1992      2041      1935
      Boundary of HCS     2505      2233      2067      1837      1524
      Core of SFC         22 467    21 560    20 543    19 085    15 921
      Boundary of SFC     6179      6276      6424      6541      6906

      Table 1.  Point number of each cloud type obtained under different density thresholds.

      Cloud type          ε neighborhood (km)
                          1         2         3         4
      Core of DCC         733       857       1015      1128
      Boundary of DCC     508       1097      1695      2362
      Core of SCC         797       413       278       234
      Boundary of SCC     1252      2132      2705      3440
      Core of HCS         1992      2020      1874      1667
      Boundary of HCS     2067      3503      4584      5482
      Core of SFC         20 543    22 592    22 435    22 277
      Boundary of SFC     6424      9476      12 161    13 877

      Table 2.  Point number of each cloud type in different ε neighborhoods.
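
      The trends described for Tables 1 and 2 can be checked by summing each column. The sketch below only re-adds the published counts; the variable names are illustrative, and the rows are ordered as in the tables (core and boundary of DCC, SCC, HCS, and SFC).

```python
# Point counts per cloud type, keyed by density threshold (Table 1).
table1 = {
    3: [953, 598, 812, 1491, 2355, 2505, 22467, 6179],
    4: [812, 566, 779, 1356, 2100, 2233, 21560, 6276],
    5: [733, 508, 797, 1252, 1992, 2067, 20543, 6424],
    6: [725, 496, 795, 1096, 2041, 1837, 19085, 6541],
    7: [699, 444, 664, 914, 1935, 1524, 15921, 6906],
}

# Point counts per cloud type, keyed by epsilon neighborhood in km (Table 2).
table2 = {
    1: [733, 508, 797, 1252, 1992, 2067, 20543, 6424],
    2: [857, 1097, 413, 2132, 2020, 3503, 22592, 9476],
    3: [1015, 1695, 278, 2705, 1874, 4584, 22435, 12161],
    4: [1128, 2362, 234, 3440, 1667, 5482, 22277, 13877],
}

# Total identified echo points for each parameter value, in key order.
totals_density = [sum(table1[k]) for k in sorted(table1)]
totals_eps = [sum(table2[k]) for k in sorted(table2)]

# Strictly decreasing with density threshold, strictly increasing with epsilon.
density_shrinks = all(a > b for a, b in zip(totals_density, totals_density[1:]))
eps_grows = all(a < b for a, b in zip(totals_eps, totals_eps[1:]))
```

      The column totals fall from 37 360 to 29 007 as the density threshold rises from 3 to 7 and rise from 34 316 to 50 467 as ε grows from 1 km to 4 km. The density-5 and ε = 1 km columns coincide at 34 316, the 2D total quoted in the text.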

    • The second example is a dataset observed with the SA radar in Xuzhou at 0754 UTC on 6 July 2019. The identifications of cloud clusters with the DBSCAN and fuzzy logic algorithms are shown in Fig. 7. The large purple areas indicate that this was a strong convective weather event, and the convective clouds accompanied by lightning were very energetic. Both the 2D and 3D models clearly distinguish the convective and stratiform features, and similar to case one, the 2D model marks more SCC points while the 3D model marks more SFC points.

      Figure 7.  Similar to Fig. 4, but the radar data were observed in Xuzhou at 0754 UTC 6 July 2019; the black line is the location of the section corresponding to Fig. 8.

      Comparing the classifications between adjacent times 0754 UTC and 0800 UTC (figure omitted), the evolution and dissipation of convective clouds can be found in both the 2D and 3D models; these results can improve our understanding of the mechanism of precipitation and allow us to better estimate precipitation.

      In the azimuth area from 55° to 95° and ranging from 200 km to 230 km in the 0.5° reflectivity PPI, there is a strong region within the echo over 35 dBZ; this region is marked as an HCS by the 2D and 3D models but as an SFC or as an uncertain cloud cluster by the fuzzy logic method, which considers that these strong echoes are caused by a melting layer. To verify this classification, a vertical section (Fig. 8) is made along the echo region (black line in Fig. 7a). As shown in Fig. 8, most echoes above 35 dBZ are higher than the 4.5-km height of the melting layer; therefore, it is reasonable that the cloud is marked as an HCS by the improved DBSCAN method. In addition, although the intensity is greater than 35 dBZ at the bottom of the black line shown in Fig. 7a, the cloud is marked as an SFC by both the 2D and 3D models. Figure 8 also verifies that the echo in this region is not strong.

      Figure 8.  The reflectivity vertical section corresponding to the black line in Fig. 7a; the black dotted line is the 4.5 km height of the 0°C layer in Xuzhou at 1200 UTC 6 July 2019.

    • Affected by the cold vortex, a large-scale severe convective weather event occurred in Shandong Province on 17 May 2020. Disastrous weather, including thunderstorms, gales of magnitude 8–10, and hail, occurred in many places that day. Taking data from the Qingdao SA radar as an example, the clouds are classified by the 3D model. There is no super-refraction within the radar detection range, as found by examining the radial velocity and the intensity vertical texture. The points of the DCC and SCC are drawn in Fig. 9a. From 0940 UTC, thunderstorms and gales appeared, while convective clouds were detected in the radar detection range. Convective clouds developed to their most energetic stage from 1300 UTC to 1500 UTC and then weakened and completely dissipated at approximately 1900 UTC; disastrous weather also dissipated at that time.

      Figure 9.  The changing trend of the number of cloud clusters identified by the 3D model; (a) 0940–1904 UTC and (b) 1128–1220 UTC (scale of the point number of the SFC boundary is on the right side of the Y-axis, and the other seven types of clouds are on the left).

      To analyze the relationship between the evolution of convection and the size of the cloud boundary area, the variations in the core and boundary points of each type of cloud from 1128 UTC to 1220 UTC are shown in Fig. 9b, and the corresponding PPI images of reflectivity are shown in Fig. 10. As shown in Fig. 10, various types of cloud clusters continued to develop during the eastward movement of the system. From 1128 UTC, the boundary of the SFC increased rapidly, which indicated that the weather system had entered the radar detection range. The core and boundary areas of other clouds increased slowly, and there was an opposite phase change between the curves of the core and boundary with time. In particular, the inverse phase change between the core and boundary curves of the DCC was more obvious, which clearly showed the dynamic evolutionary process of the convective system from occurrence to extinction. In addition, the convective clouds in the classification images (Figs. 10e–h) are clearer than those in the PPIs (Figs. 10a–d). At this distance, the strong echoes caused by the melting layer are correctly recognized as stratiform clouds, which helps locate disastrous weather more accurately.
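
      One way to quantify an inverse phase relationship such as the one described above is the correlation between the scan-to-scan changes of the core and boundary point counts. The sketch below is a possible diagnostic, not part of the published method; `phase_correlation` is a hypothetical helper, and the two series are synthetic values chosen for illustration, not data read from Fig. 9b.

```python
import numpy as np


def phase_correlation(core_counts, boundary_counts):
    """Pearson correlation between the scan-to-scan changes of two series.

    Values near -1 indicate that one curve tends to rise when the other
    falls (inverse phase); values near +1 indicate in-phase changes.
    """
    d_core = np.diff(np.asarray(core_counts, dtype=float))
    d_boundary = np.diff(np.asarray(boundary_counts, dtype=float))
    return float(np.corrcoef(d_core, d_boundary)[0, 1])


# Synthetic illustrative series (NOT taken from the paper): both grow
# overall, but their scan-to-scan changes have opposite signs.
core = [100, 140, 120, 170, 150, 200]
boundary = [300, 280, 320, 290, 330, 310]
r = phase_correlation(core, boundary)
```

      Differencing the series first removes the slow overall growth, so the sign of `r` reflects only the short-term alternation between the two curves.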

      Figure 10.  The reflectivity image of the 0.5° PPI observed with the Qingdao SA radar at 1128 UTC, 1146 UTC, 1203 UTC, and 1220 UTC on 17 May 2020 (a–d), and the results of the cloud classification performed with the 3D model (e–h).

    5.   Conclusions and discussion
    • By means of the improved DBSCAN algorithm, a cloud cluster with an arbitrary shape can be identified effectively, and the algorithm has no size or distribution requirements regarding the utilized dataset. A 2D model and a 3D model are proposed for cloud identification to classify four types of cloud clusters, namely, DCC, SCC, HCS, and SFC types, and each type is further classified into core and boundary regions. The classification results can provide more abundant cloud structure information, better indicate the macro characteristics of clouds, and more accurately predict the precipitation intensity and location of disastrous weather. Three typical cases observed with SA radars in Nanjing, Xuzhou, and Qingdao are utilized to verify the effects of this classification method, and the results are compared with the classification results obtained using the fuzzy logic method.

      (a) By adjusting the threshold values of the echo intensity, density, echo top of 35 dBZ, and ε neighborhood, clouds with sufficient density can be divided into different clusters to further judge the cloud types. During the clustering process, the cluster center and number do not need to be specified initially, and clusters with arbitrary shapes can be identified in a noisy space.

      (b) The cloud clusters of the four types and their core and boundary areas are intuitive. The identified deep convective areas correspond to the locations of lightning. According to the vertical profile of reflectivity, the identification results of energetic convective clouds, weak convective bubbles, and stratiform clouds are also objective and reasonable.

      (c) Some differences were observed between the 2D and 3D models. The 3D model identifies larger convective clouds and stratiform clouds than the 2D model. In addition, weaker convective clouds with small areas in the lower layer may be covered by deeper stratiform clouds in the 3D model. Therefore, in practice, it is best to combine the classification results of the two models. According to the continuous recognition results obtained with the adjacent volume scanning data, the evolution of cloud clusters from generation to disappearance can be clearly discovered.

      (d) The echo areas are significantly influenced by the density threshold and ε neighborhood. Different parameters have different recognition results, and the classification is inaccurate if the parameter setting is unreasonable, especially for weak clouds. Therefore, the threshold of the improved DBSCAN algorithm should be improved according to more actual observation results in the future.

      (e) Traditional algorithms, such as the fuzzy logic method, only divide clouds into two categories, namely, stratiform and convective clouds, but the improved DBSCAN algorithm has a more detailed cloud classification, which can identify the boundaries and cores of four types of clouds. During convection deepening, the boundary of clouds gradually develops into the core of the clouds, and the core also develops into boundary clouds. There is an inverse phase relationship between the two evolutionary curves with time, which can better predict the occurrence, development, and dissipation of strong convective weather.

      Acknowledgements. This research was funded by the Key-Area Research and Development Program of Guangdong Province (Grant No. 2020B1111200001), the Key project of monitoring, early warning and prevention of major natural disasters of China (Grant No. 2019YFC1510304), the S&T Program of Hebei (Grant No. 19275408D), and the Scientific Research Projects of Weather Modification in Northwest China (Grant No. RYSY201905). The authors would like to sincerely thank Fen XU of Nanjing Joint Institute for Atmospheric Sciences for providing technical guidance.
