Similarity Measurement Method of Near Infrared Spectrum Based on Grid Division Local Linear Embedding Algorithm
Abstract
Near-infrared spectral data are high-dimensional, highly redundant, noisy, and nonlinear, all of which seriously degrade the accuracy of spectral similarity measurement. To address this problem, a near-infrared spectral similarity measurement method based on the grid division local linear embedding (GGLLE) algorithm is proposed. First, the high-dimensional spectral data are divided into multiple grid subspaces according to how key chemical components are expressed in the spectrum. Second, the local linear embedding (LLE) algorithm is improved in two respects, and the improved LLE algorithm is applied to each subspace in turn to map its features from the high-dimensional space to a low-dimensional space and to compute a similarity matrix for that subspace. Finally, the subspace similarity matrices are normalized and summed to produce the similarity matrix of the spectral sample set, which yields the similarity measurement. Two sets of tobacco leaf spectra provided by a tobacco company were used to build the similarity measurement model, with the accuracy of the similarity measurement as the criterion for comparing algorithms. The experimental results show that the model built with GGLLE reaches an accuracy of 93.3%, clearly higher than the 64.2%, 67.5%, and 82.5% obtained with principal component analysis, stacked autoencoders, and LLE, respectively, which demonstrates the effectiveness of the GGLLE algorithm.
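The three steps in the abstract (split the spectrum into subspaces, embed each subspace with LLE, then normalize and sum the per-subspace similarity matrices) can be sketched as follows. This is only an illustrative outline under several assumptions that do not come from the paper: the bands are split into equal-width blocks rather than by key chemical components, the embedding is the standard LLE rather than the authors' improved variant (whose two modifications are not described here), similarity is taken as one minus the max-scaled Euclidean distance in the embedded space, and all function and parameter names (`lle_embed`, `subspace_similarity`, `grid_lle_similarity`, `n_grids`) are hypothetical.

```python
import numpy as np


def lle_embed(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Standard LLE embedding (assumption: NOT the paper's improved variant)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]        # neighbours centred on sample i
        C = Z @ Z.T                  # local Gram matrix
        trace = np.trace(C)
        # Regularise so the local system is always solvable.
        C += np.eye(n_neighbors) * reg * (trace if trace > 0 else 1.0)
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()  # reconstruction weights sum to 1

    I = np.eye(n)
    M = (I - W).T @ (I - W)
    _, vecs = np.linalg.eigh(M)      # eigenvalues returned in ascending order
    return vecs[:, 1:n_components + 1]  # skip the constant eigenvector


def subspace_similarity(Y):
    """Similarity in [0, 1]: one minus the max-scaled embedding distance."""
    d = np.sqrt(((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1))
    d_max = d.max()
    if d_max == 0:
        return np.ones_like(d)
    return 1.0 - d / d_max


def grid_lle_similarity(X, n_grids=4, **lle_kwargs):
    """Sum the normalised similarity matrices of all band subspaces."""
    # Assumption: equal-width band blocks stand in for the
    # chemistry-guided grid division described in the abstract.
    bands = np.array_split(np.arange(X.shape[1]), n_grids)
    total = np.zeros((X.shape[0], X.shape[0]))
    for idx in bands:
        Y = lle_embed(X[:, idx], **lle_kwargs)
        total += subspace_similarity(Y)
    return total


# Synthetic stand-in for spectra: 30 samples x 40 wavelength points.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(30, 40))
S = grid_lle_similarity(spectra, n_grids=4, n_neighbors=6, n_components=2)
```

Each entry `S[i, j]` ranges from 0 to `n_grids`, and larger values mean samples `i` and `j` stay close in more subspaces. Dividing by `n_grids` would rescale the result to [0, 1]; that rescaling is a convenience and is not stated in the abstract.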
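CLC number: O433.4
Column: Spectroscopy
Funding: National Key Research and Development Program of China (2017YFB1400903)
Received: 2018-07-06
Revised: 2018-08-08
Published online: 2018-08-17
Author affiliations
Ding Xiangqian: College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100
Qin Yuhua: College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong 266061
Hou Ruichun: College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100
Zhang Lei: Shandong Tobacco Research Institute Co., Ltd., Jinan, Shandong 250101
Corresponding author: Qin Yuhua (yuu71@163.com)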
Cite this paper
Xu Baoding, Ding Xiangqian, Qin Yuhua, Hou Ruichun, Zhang Lei. Similarity Measurement Method of Near Infrared Spectrum Based on Grid Division Local Linear Embedding Algorithm[J]. Laser & Optoelectronics Progress, 2019, 56(3): 033001