Spectroscopy and Spectral Analysis, 2023, 43(3): 903. Published online: 2023-04-07

VIS-NIR Hyperspectral Prediction of Soil Organic Matter Based on Stacking Generalization Model
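Author affiliations
1 College of Agricultural Engineering, Shanxi Agricultural University, Taigu, Shanxi 030801
2 Millet Research Institute, Shanxi Agricultural University, Changzhi, Shanxi 046000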
Abstract
Accurate prediction of soil organic matter content helps evaluate farmland fertility and provides a data basis for precision agriculture. Single models for rapid estimation of organic matter content in farmland topsoil suffer from low accuracy and weak generalization ability. To address this, a stacked generalization model (SGM) for predicting organic matter content was proposed, using visible and near-infrared (VIS-NIR) hyperspectral data from the topsoil of typical cinnamon soil farmland in Shanxi Province. The raw hyperspectral data were first smoothed with a wavelet; the smoothed spectra were then transformed with the reciprocal first derivative and the logarithmic reciprocal first derivative, and feature bands were extracted using the correlation coefficient and recursive feature elimination. Four ensemble learners, Random Forest (RF), Gradient Boosting Decision Tree (GBDT), eXtreme Gradient Boosting (XGBoost) and AdaBoost, served as primary learners and predicted organic matter content under 5-fold cross-validation. Stochastic gradient descent (SGD) was then used as the meta-learner on the primary learners' predictions to build the SGM, overcoming the low accuracy and instability of single models and enabling rapid, stable detection of organic matter content. The results show that the spectra after the reciprocal first-derivative transformation correlate fairly well with organic matter content, with the strongest correlation coefficient reaching -0.611. The SGM achieved a coefficient of determination (R²) of 0.819 and a residual prediction deviation (RPD) of 2.256, which are 0.055 and 0.323 higher than the average R² and average RPD of the other algorithms, respectively. Its mean absolute error (MAE) and root mean square error (RMSE) were 1.742 and 2.308 g·kg⁻¹, which are 0.406 and 0.389 g·kg⁻¹ lower than the average MAE and average RMSE of the other algorithms, respectively. The improvement is clear, and the model can be used for effective estimation of organic matter content in farmland topsoil. The results can provide a basis and reference for rapid hyperspectral detection of organic matter content in farmland topsoil.
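
The spectral transformations, the correlation screening and the four reported metrics can be illustrated with a short NumPy sketch. It is not the authors' code. The logarithm base (10), the RPD definition (standard deviation of the measured values divided by RMSE), the function names and the synthetic wavelengths and reflectance values are all assumptions made for illustration; the wavelet smoothing step is not reproduced.

```python
import numpy as np


def reciprocal_first_derivative(reflectance, wavelengths):
    """First derivative of 1/R with respect to wavelength, per sample (row)."""
    return np.gradient(1.0 / reflectance, wavelengths, axis=1)


def log_reciprocal_first_derivative(reflectance, wavelengths):
    """First derivative of log10(1/R); base 10 is an assumption."""
    return np.gradient(np.log10(1.0 / reflectance), wavelengths, axis=1)


def band_correlations(spectra, som):
    """Pearson correlation between every band (column) and organic matter content."""
    s = spectra - spectra.mean(axis=0)
    y = som - som.mean()
    return (s * y[:, None]).sum(axis=0) / np.sqrt((s ** 2).sum(axis=0) * (y ** 2).sum())


def regression_metrics(y_true, y_pred):
    """R2, RPD (assumed to be SD / RMSE), MAE and RMSE for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    rmse = np.sqrt(np.mean(residuals ** 2))
    return {
        "R2": 1.0 - np.sum(residuals ** 2) / np.sum((y_true - y_true.mean()) ** 2),
        "RPD": np.std(y_true, ddof=1) / rmse,
        "MAE": np.mean(np.abs(residuals)),
        "RMSE": rmse,
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    wavelengths = np.linspace(400.0, 1000.0, 61)        # hypothetical band centres, nm
    reflectance = rng.uniform(0.1, 0.6, size=(20, 61))  # synthetic, not the paper's data
    som = rng.uniform(5.0, 30.0, size=20)               # synthetic organic matter, g/kg
    r = band_correlations(reciprocal_first_derivative(reflectance, wavelengths), som)
    best = np.argmax(np.abs(r))
    print(f"strongest band: {wavelengths[best]:.0f} nm, r = {r[best]:.3f}")
```

Ranking bands by the absolute value of the correlation, as in the demo, is one plausible way to read "the strongest correlation" when the coefficient is negative.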

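The stacking scheme described in the abstract, with primary learners evaluated under 5-fold cross-validation and an SGD meta-learner fitted on their predictions, could look roughly like the scikit-learn sketch below. It is an illustration under stated assumptions, not the authors' implementation. The hyperparameters, the use of a random forest inside recursive feature elimination, the number of selected bands, the standardization before SGD and the synthetic data are all assumed. XGBoost is omitted so that the example depends only on scikit-learn.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.feature_selection import RFE
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_sgm_sketch(n_bands=10, random_state=0):
    """RFE band selection followed by a stacked regressor (illustrative settings)."""
    base_learners = [
        ("rf", RandomForestRegressor(n_estimators=50, random_state=random_state)),
        ("gbdt", GradientBoostingRegressor(n_estimators=50, random_state=random_state)),
        ("ada", AdaBoostRegressor(n_estimators=50, random_state=random_state)),
        # The abstract's fourth learner, XGBoost, is left out so that only
        # scikit-learn is required; xgboost.XGBRegressor could be added here.
    ]
    # SGD is sensitive to feature scale, so base predictions are standardized first.
    meta_learner = make_pipeline(
        StandardScaler(),
        SGDRegressor(max_iter=2000, tol=1e-4, random_state=random_state),
    )
    # cv=5: the meta-learner is trained on out-of-fold predictions of the base learners.
    stack = StackingRegressor(estimators=base_learners, final_estimator=meta_learner, cv=5)
    selector = RFE(
        RandomForestRegressor(n_estimators=30, random_state=random_state),
        n_features_to_select=n_bands,
        step=5,
    )
    return make_pipeline(selector, stack)


def make_synthetic_data(n_samples=150, n_features=30, seed=0):
    """Synthetic stand-in for transformed spectra and organic matter content."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    y = 20.0 + 3.0 * X[:, 0] - 2.0 * X[:, 5] + X[:, 10] + rng.normal(scale=0.5, size=n_samples)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_data()
    model = build_sgm_sketch().fit(X[:110], y[:110])
    print("held-out R2:", model.score(X[110:], y[110:]))
```

Training the meta-learner on out-of-fold predictions, which `StackingRegressor` does internally through its `cv` argument, keeps it from seeing predictions that the primary learners made on their own training samples.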
ZHANG Xiu-quan, LI Zhi-wei, ZHENG De-cong, SONG Hai-yan, WANG Guo-liang. VIS-NIR Hyperspectral Prediction of Soil Organic Matter Based on Stacking Generalization Model[J]. Spectroscopy and Spectral Analysis, 2023, 43(3): 903.
