
Meteorological factors-driven prediction of interannual variability of soybean yield in Northeast China based on machine learning methods

    Abstract: Soybean is the world's most important oil crop and source of plant protein, and one of the world's four major grain crops. China is the world's fourth-largest soybean producer and its largest consumer. Northeast China is the country's main soybean-producing region, accounting for approximately half of national output. The interannual variability of soybean yield in Northeast China is driven primarily by meteorological factors, so accurately predicting this weather-driven variability is important for food security and market stability. Previous statistical prediction studies have mostly been restricted to local areas, selected counties, or a single province within Northeast China, and have either covered prediction periods of no more than 5 years or evaluated fitting skill only. To address these limitations, this study used ridge regression, Lasso regression, support vector regression, K-nearest neighbors regression, decision tree regression, and random forest to build provincial-scale models that predict interannual soybean yield variability in Northeast China from meteorological factors for 1981-2018, and evaluated and compared their cross-validation skill. The main findings are: (1) Among the six machine learning methods, ridge regression showed the best overall performance across the three provinces, with cross-validation correlation coefficients of 0.48 (P<0.01), 0.58 (P<0.001), and 0.72 (P<0.001) in Heilongjiang, Jilin, and Liaoning, respectively; (2) Compared with stepwise linear regression, ridge regression performed better in both correlation coefficient (R) and root mean square error (RMSE), and was slightly less accurate only in amplitude prediction for Jilin and Liaoning provinces; (3) Predictor selection and sample augmentation improved the cross-validation prediction skill of the machine learning models in most cases; (4) The key window in which meteorological factors affect yield formation is the flowering and pod-setting period in July-August, when temperature, precipitation, and sunshine duration have a significant positive synergistic effect: ample water and heat favor flower and pod formation, grain development, and enhanced photosynthesis, thereby raising final yield. These findings provide a scientific basis for soybean yield prediction and agricultural risk management in Northeast China.
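
As an illustration of the two skill measures named in the abstract, the following minimal sketch computes a cross-validated correlation coefficient (R) and RMSE for a ridge regression. It is not the authors' implementation. The abstract does not state the cross-validation scheme, the regularization strength, or the predictors used, so leave-one-out cross-validation, `alpha = 1.0`, and three predictors are assumptions made here. The data are synthetic random numbers, not values from the study, and the helper names (`ridge_fit`, `ridge_predict`, `loo_cv`) are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, alpha):
    """Fit ridge regression; centring X and y leaves the intercept unpenalized."""
    n, p = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / n for j in range(p)]
    ybar = sum(y) / n
    Xc = [[row[j] - mu[j] for j in range(p)] for row in X]
    # Normal equations with the ridge penalty on the diagonal: (Xc'Xc + alpha I) w = Xc'(y - ybar)
    A = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * (yi - ybar) for r, yi in zip(Xc, y)) for i in range(p)]
    return mu, solve(A, b), ybar


def ridge_predict(model, x):
    """Predict one sample from a fitted (means, weights, intercept) tuple."""
    mu, w, ybar = model
    return ybar + sum(wj * (xj - mj) for wj, xj, mj in zip(w, x, mu))


def loo_cv(X, y, alpha):
    """Leave-one-out cross-validation: predict each year from a model fitted without it."""
    preds = []
    for k in range(len(y)):
        model = ridge_fit(X[:k] + X[k + 1:], y[:k] + y[k + 1:], alpha)
        preds.append(ridge_predict(model, X[k]))
    return preds


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def rmse(a, b):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Synthetic stand-in data (assumption): 38 samples, three standardized predictors.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(38)]
y = [0.6 * x[0] + 0.3 * x[1] + rng.gauss(0.0, 0.5) for x in X]

preds = loo_cv(X, y, alpha=1.0)
r = pearson_r(y, preds)
err = rmse(y, preds)
print(f"cross-validated R = {r:.2f}, RMSE = {err:.2f}")
```

Because each prediction comes from a model that never saw the target year, R and RMSE computed this way measure out-of-sample skill rather than goodness of fit, which is the distinction the abstract draws with earlier studies that reported fitting skill only.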

     
