Acta Optica Sinica, 2019, 39(9): 0930002, published online: 2019-09-09

特征变量选择和回归方法相结合的土壤有机质含量估算

Estimation of Soil Organic Matter Content Based on Characteristic Variable Selection and Regression Methods
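Author affiliations
1 College of Geographical Science, Qinghai Normal University; Qinghai Province Key Laboratory of Physical Geography and Environmental Process, Xining, Qinghai 810008, China
2 Chinese Research Academy of Environmental Sciences, Beijing 100012, China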
Abstract
Soil hyperspectral data are voluminous and highly redundant. To address this, five feature variable selection methods are compared for estimating soil organic matter content: stability competitive adaptive reweighted sampling (sCARS), the successive projections algorithm (SPA), the genetic algorithm (GA), iteratively retaining informative variables (IRIV), and sCARS combined with SPA (sCARS-SPA). Each method selects characteristic variables from the full-band spectral data, and partial least squares regression (PLSR), support vector machine (SVM), and random forest (RF) models are then built on both the full bands and the selected characteristic bands to predict soil organic matter content. The results show that combining variable selection with the PLSR and SVM models improves both computational efficiency and prediction ability relative to full-band modeling. For the RF model, building on characteristic variables does not noticeably improve accuracy, but it substantially reduces the number of input variables and thus greatly improves modeling efficiency. Overall, the RF model is more accurate than the SVM and PLSR models. The model combining IRIV with RF uses only 63 variables; its coefficients of determination (R²) for the calibration and validation sets are 0.941 and 0.96, respectively, and its relative percent deviation (RPD) for the validation set is 4.8, indicating very good prediction capability. Compared with full-band modeling, combining characteristic variable selection with regression methods effectively improves modeling efficiency while maintaining model accuracy.
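The abstract reports model quality as R² and RPD. As a minimal sketch of how these two figures are commonly computed, the Python function below assumes R² = 1 − SS_res/SS_tot and RPD = (sample standard deviation of the observed values) / RMSE; the paper's exact definitions are not given in the abstract, so both formulas are assumptions. The function name `r2_and_rpd` and the sample values are hypothetical and are not data from the study.

```python
import math


def r2_and_rpd(observed, predicted):
    """Return (R2, RMSE, RPD) for paired observed and predicted values.

    Assumed definitions (not confirmed by the abstract):
      R2   = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
      RPD  = sample standard deviation of observed / RMSE
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_obs = sum(observed) / n
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviations of observations from their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0 or ss_res == 0:
        raise ValueError("R2 or RPD is undefined for constant or perfectly fitted data")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / (n - 1))  # sample standard deviation (n - 1 denominator)
    return r2, rmse, sd / rmse


# Hypothetical organic matter values for illustration only.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 31.0, 39.0, 52.0]
r2, rmse, rpd = r2_and_rpd(observed, predicted)
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  RPD={rpd:.2f}")
```

Applied to a validation set, the same function would give figures comparable in kind to the R² and RPD quoted above, provided the paper uses the same definitions.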

Guanwen Li (李冠稳), Xiaohong Gao (高小红), Nengwen Xiao (肖能文), Yunfei Xiao (肖云飞). Estimation of Soil Organic Matter Content Based on Characteristic Variable Selection and Regression Methods[J]. Acta Optica Sinica, 2019, 39(9): 0930002.
