Citation: ZHAO Yuhui, CHEN Guanghua, WANG Ziqing, et al. 2024. Applying Machine Learning in Clustering and Discriminant Analysis of Large-Scale Circulation Patterns Favorable for Tropical Cyclogenesis over the Western North Pacific [J]. Chinese Journal of Atmospheric Sciences (in Chinese), 48(2): 671−686. doi: 10.3878/j.issn.1006-9895.2208.22074

Applying Machine Learning in Clustering and Discriminant Analysis of Large-Scale Circulation Patterns Favorable for Tropical Cyclogenesis over the Western North Pacific

  • Abstract: Using the International Best Track Archive for Climate Stewardship (IBTrACS) tropical cyclone best-track data and the European Centre for Medium-Range Weather Forecasts fifth-generation reanalysis (ERA5) for June to November of 1979–2020, this study applies a self-organizing map (SOM) to the 850 hPa horizontal wind field centered on each tropical cyclone (TC) genesis location and classifies the low-level, large-scale circulation preceding TC genesis over the western North Pacific into five patterns: Monsoon Confluence (MC), Monsoon Gyre (MG), Strong Monsoon Trough (SMT), Weak Monsoon Trough (WMT), and Easterly Wave (EW). TCs in the MC pattern form in the confluence zone south of the subtropical high and account for the largest share of geneses. Genesis in the MG, SMT, and WMT patterns is influenced by the cyclonic shear or confluence zone associated with the monsoon trough. In the EW pattern, which accounts for the smallest share, TCs develop from amplifying easterly waves. Building on this classification of the historical record, three machine learning methods are compared for automatic identification of the circulation pattern of a given TC: support vector machine (SVM), k-nearest neighbors (KNN), and random forest (RF). The SVM attains an accuracy of 0.965, with recall and precision both 0.94 or higher for each of the five patterns, and is insensitive to class imbalance. A sensitivity analysis on sample size further shows that the SVM can adequately learn the circulation features of each pattern from a limited number of samples, and it clearly outperforms KNN and RF.
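The clustering step summarized in the abstract can be illustrated with a minimal self-organizing map. The sketch below is not the authors' implementation: the abstract does not state the SOM grid layout, the preprocessing, or the training schedule, so the one-dimensional chain of five nodes (chosen only to mirror the five reported patterns), the learning-rate and neighborhood decay, and the synthetic four-component "wind fields" produced by the hypothetical helper `make_synthetic_fields` are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wj - xj) ** 2 for wj, xj in zip(weights[j], x)),
    )


def train_som(samples, n_nodes=5, n_epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a 1-D SOM (a chain of nodes) and return the node weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialise every node from a randomly chosen sample (copied, not aliased).
    weights = [list(rng.choice(samples)) for _ in range(n_nodes)]
    n_steps = n_epochs * len(samples)
    step = 0
    for _ in range(n_epochs):
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.1)  # shrinking neighbourhood width
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured along the node chain.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def make_synthetic_fields(n_per_type=20, seed=1):
    """Hypothetical stand-in data: noisy copies of five flattened (u, v) vectors."""
    rng = random.Random(seed)
    centers = [
        (-4.0, 0.0, 4.0, 0.0),
        (4.0, 0.0, -4.0, 0.0),
        (0.0, 4.0, 0.0, -4.0),
        (0.0, -4.0, 0.0, 4.0),
        (-4.0, -4.0, -4.0, -4.0),
    ]
    return [
        [c + rng.gauss(0.0, 0.5) for c in center]
        for center in centers
        for _ in range(n_per_type)
    ]


fields = make_synthetic_fields()
nodes = train_som(fields)
# Each sample is assigned the pattern index of its best-matching node.
labels = [best_matching_unit(nodes, f) for f in fields]
print({j: labels.count(j) for j in range(len(nodes))})
```

In a real application each sample would be the flattened 850 hPa wind field on a grid centered on the genesis location, and the resulting node labels would then serve as training targets for a classifier such as the SVM discussed in the abstract.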

     
