A Study on Retrieving Cloud Classification and Cloud Height Using FY-4A and CloudSat Satellite Data via Random Forest Algorithms
Abstract
Cloud classification is a crucial part of meteorological research and operational applications. Traditional methods are limited because single-satellite data provide insufficient spectral or vertical-structure information. This study proposes a multi-source satellite data fusion method for cloud classification based on the random forest algorithm. A cloud type identification and cloud height inversion model is constructed from the high spatial and temporal resolution radiation data of the Fengyun-4A (FY-4A) satellite and the vertical observations of the CloudSat Cloud Profiling Radar (CPR). To build the training data, the radiation data of the 14 FY-4A channels (including visible, shortwave infrared, and longwave infrared bands) are matched spatially and temporally with the cloud classification mask parameters of CloudSat. Specific channel combinations are selected as input features according to the physical characteristics of the channel radiation and statistical analysis. A classification model based on the random forest algorithm performs fine-grained cloud type identification, and a regression model predicts the cloud base and cloud top heights from the classification results. Two classification models are designed and compared: a common random forest model (Model A) and a random forest model with a hierarchical architecture (Model B). The results show that the two models reach classification accuracies of 94.2% and 95.7%, respectively, in daytime scenes. When clear-sky samples are excluded, the R² scores of the cloud top height regression are close to 0.98 and 0.99, respectively. In nighttime scenes, where the shortwave channels are unavailable, the classification accuracies are 92.0% and 93.08%, respectively. Finally, two typhoon cases are analyzed comparatively to verify that the model identifies strong convective cloud systems efficiently.
However, the inversion accuracy of the model for low cloud and thin fog scenes still needs to be improved. This study combines the respective advantages of geostationary and polar-orbiting satellites, performs cloud retrieval with the random forest algorithm, and improves the interpretability of the model. It provides a technical reference for meteorological applications that integrate multi-source satellite collaborative observations with machine learning.
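The spatial and temporal matching step that produces the training data can be illustrated with a minimal sketch. It pairs each CloudSat profile with the nearest FY-4A pixel inside a distance and time window. The record layout (`time`, `lat`, `lon`, `channels`, `cloud_type`), the function names, the thresholds `max_km` and `max_seconds`, and the toy values are all assumptions made for illustration; they are not the matching criteria or data used in this study.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def collocate(cloudsat_profiles, fy4a_pixels, max_km=4.0, max_seconds=450.0):
    """Pair each CloudSat profile with its nearest FY-4A pixel.

    A pixel is a candidate only if it lies within max_seconds of the profile
    time and within max_km of the profile location. Both thresholds are
    hypothetical defaults, not values taken from the paper.
    Returns a list of (channel_values, cloud_type) training samples.
    """
    samples = []
    for prof in cloudsat_profiles:
        best_pixel, best_dist = None, max_km
        for pix in fy4a_pixels:
            if abs(pix["time"] - prof["time"]) > max_seconds:
                continue  # outside the temporal window
            dist = haversine_km(prof["lat"], prof["lon"], pix["lat"], pix["lon"])
            if dist <= best_dist:
                best_pixel, best_dist = pix, dist
        if best_pixel is not None:
            samples.append((best_pixel["channels"], prof["cloud_type"]))
    return samples


# Toy records (invented values): times in seconds, two channels per pixel.
cloudsat_demo = [
    {"time": 0.0, "lat": 20.0, "lon": 120.0, "cloud_type": "Cu"},
    {"time": 0.0, "lat": -10.0, "lon": 100.0, "cloud_type": "Ci"},
]
fy4a_demo = [
    {"time": 100.0, "lat": 20.01, "lon": 120.01, "channels": [0.3, 280.0]},
    {"time": 100.0, "lat": 20.5, "lon": 120.5, "channels": [0.1, 250.0]},
    {"time": 5000.0, "lat": -10.0, "lon": 100.0, "channels": [0.2, 230.0]},
]
demo_samples = collocate(cloudsat_demo, fy4a_demo)
```

The brute-force search is quadratic in the number of records and is meant only to make the matching logic explicit; a spatial index would be the usual choice for full-disk imagery.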