Acta Optica Sinica, Vol. 39, No. 9, 0930002 (2019)

Estimation of Soil Organic Matter Content Based on Characteristic Variable Selection and Regression Methods


Abstract

Soil hyperspectral data are voluminous and highly redundant. To address this, five variable selection methods are compared for estimating soil organic matter content: stability competitive adaptive reweighted sampling (sCARS), the successive projections algorithm (SPA), the genetic algorithm (GA), iteratively retaining informative variables (IRIV), and sCARS combined with SPA (sCARS-SPA). Each method selects characteristic variables from the full-band spectral data. Partial least squares regression (PLSR), support vector machine (SVM), and random forest (RF) models are then built on both the full bands and the selected characteristic bands to predict soil organic matter content. The results show that, for the PLSR and SVM models, variable selection improves both computational efficiency and prediction ability relative to the full bands. For the RF model, building on characteristic variables does not noticeably improve accuracy, but it greatly reduces the number of model variables and thereby improves modeling efficiency. Overall, the RF model is more accurate than the SVM and PLSR models. The model combining IRIV with RF uses only 63 variables; its coefficients of determination (R²) for the calibration and validation sets are 0.941 and 0.96, respectively, and its relative deviation (RPD) for the validation set is 4.8, indicating very good prediction capacity. Compared with full-band modeling, combining characteristic variable selection with regression methods effectively improves modeling efficiency while maintaining model accuracy.
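The abstract reports model quality as R² and RPD but does not state their formulas. The following is a minimal sketch that assumes the definitions common in chemometrics: R² = 1 − SS_res/SS_tot, and RPD as the sample standard deviation of the measured values divided by the root mean square error of prediction. The function names and the sample values are hypothetical illustrations; they are not the paper's code or data, and the paper may define RPD slightly differently.

```python
import math
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def rpd(y_true, y_pred):
    """Assumed RPD definition: sample SD of measured values over RMSE."""
    return statistics.stdev(y_true) / rmse(y_true, y_pred)


if __name__ == "__main__":
    # Made-up measured and predicted values, for illustration only.
    measured = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
    print("R2  =", round(r_squared(measured, predicted), 3))
    print("RPD =", round(rpd(measured, predicted), 2))
```

Under this definition, RPD grows as the prediction error shrinks relative to the natural spread of the measured values, which is why it is often reported alongside R² for a validation set.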

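Supplementary information

CLC number: TP79; S151.9

DOI: 10.3788/AOS201939.0930002

Column: Spectroscopy

Funding: National Natural Science Foundation of China

Received: 2019-03-05

Revised: 2019-05-05

Published online: 2019-09-01

Author affiliations

Li Guanwen (李冠稳): College of Geographical Sciences, Qinghai Normal University, and Qinghai Province Key Laboratory of Physical Geography and Environmental Process, Xining, Qinghai 810008; Chinese Research Academy of Environmental Sciences, Beijing 100012
Gao Xiaohong (高小红)
Xiao Nengwen (肖能文): Chinese Research Academy of Environmental Sciences, Beijing 100012
Xiao Yunfei (肖云飞): College of Geographical Sciences, Qinghai Normal University, and Qinghai Province Key Laboratory of Physical Geography and Environmental Process, Xining, Qinghai 810008

Corresponding authors: Li Guanwen (lgw126522@163.com); Gao Xiaohong (xiaohonggao226@163.com)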

Cite this paper

Li Guanwen, Gao Xiaohong, Xiao Nengwen, Xiao Yunfei. Estimation of Soil Organic Matter Content Based on Characteristic Variable Selection and Regression Methods[J]. Acta Optica Sinica, 2019, 39(9): 0930002
