Spectroscopy and Spectral Analysis, 2021, 41(6): 1782, published online: 2021-07-16

Least Angle Regression Combined With Competitive Adaptive Re-Weighted Sampling for NIR Spectral Wavelength Selection
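Author affiliations
1 School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
2 School of Business, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
3 School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
4 National Institutes for Food and Drug Control, Beijing 100050, China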
Abstract
Near-infrared (NIR) spectroscopy is widely used in drug testing, the petrochemical industry, and other fields because it is non-destructive, fast, and accurate. In recent years, the in-depth application of machine learning and deep learning modeling methods has further improved its detection accuracy. However, the NIR spectral data of a sample are high-dimensional and suffer from spectral overlap, collinearity, and noise, all of which degrade the performance of NIR spectral models, so selecting the effective characteristic wavelengths of the sample is extremely important. To improve the accuracy and reliability of quantitative and qualitative NIR analysis models, a variable selection method for NIR spectra is proposed that combines the advantages of least angle regression (LAR) and competitive adaptive re-weighted sampling (CARS) and offers better performance. The method first uses LAR to preliminarily screen characteristic wavelengths over the full spectral range of the sample, and then uses CARS to select further among the screened wavelengths, thereby removing irrelevant wavelengths. To verify its effectiveness, the method was evaluated in both quantitative and qualitative analysis. In the quantitative experiment, PLS regression models were built on a drug sample data set, with FULL, LAR, CARS, SPA, and UVE as comparison methods; the PLS model built on the variables selected by LAR-CARS showed a higher coefficient of determination of prediction and a lower standard deviation of prediction. In the qualitative experiment, classification models were built on drug data sets with different training-set proportions, with SVM, ELM, SWELM, and BP as comparison methods; the SVM classification model built on the variables selected by LAR-CARS reached an accuracy of up to 100%. The experimental results show that LAR-CARS effectively selects the wavelengths that characterize the sample, that the quantitative and qualitative analysis models built on these wavelengths are more robust, and that the method can be used for characteristic wavelength selection in sample spectra.
LU Hao-xiang, ZHANG Jing, LI Ling-qiao, LIU Zhen-bing, YANG Hui-hua, FENG Yan-chun, YIN Li-hui. Least Angle Regression Combined With Competitive Adaptive Re-Weighted Sampling for NIR Spectral Wavelength Selection[J]. Spectroscopy and Spectral Analysis, 2021, 41(6): 1782.
