A Study on Retrieving Cloud Classification and Cloud Height Using FY-4A and CloudSat Satellite Data via Random Forest Algorithms

Abstract: Cloud classification is a key step in meteorological research and operational applications, but traditional methods are limited because data from a single satellite lack either spectral or vertical-structure information. This study proposes a multi-source satellite data fusion method for cloud classification based on the random forest algorithm. Using the high spatiotemporal-resolution radiance data from the Fengyun-4A (FY-4A) satellite together with the vertical observations from the CloudSat Cloud Profiling Radar (CPR), models for cloud type identification and cloud height retrieval are constructed. To obtain training data, the radiance data from 14 FY-4A channels (covering visible, shortwave infrared, and longwave infrared bands) are matched in space and time with the CloudSat cloud classification mask parameters. Specific channel combinations are selected as input features according to the physical characteristics of the channel radiances and statistical regularities. A random forest classification model performs fine-grained cloud type identification, and regression models then predict cloud base and cloud top heights on the basis of the classification results. Two designs of the classification model are compared: a standard random forest model (Model A) and a hierarchical random forest model (Model B). The results show that, in daytime scenes, the two models reach classification accuracies of 94.2% and 95.7%, respectively. With clear-sky samples excluded, the R² scores for cloud top height regression are close to 0.98 and 0.99, respectively. In nighttime scenes, where the shortwave channels are unavailable, the classification accuracies are 92.0% and 93.08%, respectively. Finally, a comparative analysis of two typhoon cases verifies that the model identifies strong convective cloud systems effectively. However, the retrieval accuracy for low cloud and light fog scenes still needs improvement. The novelty of this study lies in combining the respective strengths of geostationary and polar-orbiting satellites for cloud retrieval with the random forest algorithm, which also improves model interpretability. The study provides a technical reference for meteorological applications that combine coordinated multi-source satellite observations with machine learning.
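The space-time matching step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the matching criteria, so everything below is an assumption made for illustration only: the class names `ImagerPixel` and `RadarProfile`, the function `collocate`, and the thresholds `max_km` and `max_dt_s` are hypothetical and are not taken from the paper. The sketch pairs each radar profile with the nearest imager pixel that lies within a distance limit and a time window, producing (features, label) samples of the kind a random forest classifier could be trained on.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class ImagerPixel:
    """One geostationary imager pixel (hypothetical container)."""
    lat: float
    lon: float
    time_s: float      # observation time in seconds
    radiances: tuple   # channel values used as input features


@dataclass
class RadarProfile:
    """One radar footprint with its cloud-type label (hypothetical container)."""
    lat: float
    lon: float
    time_s: float
    cloud_type: int


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres on a spherical Earth."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def collocate(pixels, profiles, max_km=4.0, max_dt_s=450.0):
    """Pair each profile with the nearest pixel inside the space and time limits.

    max_km and max_dt_s are illustrative defaults, not values from the paper.
    Returns a list of (radiances, cloud_type) training samples.
    """
    samples = []
    for prof in profiles:
        best, best_dist = None, max_km
        for px in pixels:
            if abs(px.time_s - prof.time_s) > max_dt_s:
                continue  # outside the time window
            dist = great_circle_km(prof.lat, prof.lon, px.lat, px.lon)
            if dist <= best_dist:
                best, best_dist = px, dist
        if best is not None:
            samples.append((best.radiances, prof.cloud_type))
    return samples
```

This brute-force search is quadratic in the number of points; for full-disk imagery a spatial index would be the more practical choice, but the matching logic stays the same. Profiles with no pixel inside both limits are simply dropped.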