Laser & Optoelectronics Progress, 2019, 56(3): 033001, Published online: 2019-07-31
Similarity Measurement Method of Near Infrared Spectrum Based on Grid Division Local Linear Embedding Algorithm
spectroscopy; near-infrared spectrum; similarity measurement; improved local linear embedding algorithm; grid subspace; geodesic distance; high-dimensional data
Abstract
Near-infrared spectral data are high-dimensional, highly redundant, noisy, and nonlinear, which seriously degrades the accuracy of spectral similarity measurement. To address this problem, a near-infrared spectral similarity measurement method based on the grid division local linear embedding (GGLLE) algorithm is proposed. First, the high-dimensional spectral data are divided into multiple grid subspaces according to how the key chemical components are expressed in the spectrum. Second, the local linear embedding (LLE) algorithm is improved in two respects, and the improved LLE algorithm is applied to each subspace in turn to map its features from the high-dimensional space to a low-dimensional space, from which a similarity matrix for that subspace is computed. Finally, the subspace similarity matrices are normalized and summed to produce the similarity matrix of the spectral sample set, which gives the similarity measurement of the spectra. In the experiments, two sets of tobacco leaf spectra provided by a tobacco company were used to build the similarity measurement model, with the accuracy of the similarity measurement as the criterion for comparing algorithms. The results show that the model built with the GGLLE algorithm reaches an accuracy of 93.3%, clearly higher than the 64.2%, 67.5%, and 82.5% obtained with principal component analysis, stacked autoencoders, and the LLE algorithm, respectively, which demonstrates the effectiveness of the GGLLE algorithm.
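The three steps in the abstract (split the spectrum into subspaces, embed each subspace with LLE, then normalize and sum the per-subspace similarity matrices) can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: it uses standard LLE rather than the authors' improved variant, the `bounds` are equal-width index ranges rather than regions chosen from chemical components, similarity is taken as one minus the max-normalized Euclidean distance in the embedded space, and all function names are hypothetical.

```python
import numpy as np

def lle(X, n_neighbors, n_components, reg=1e-3):
    """Standard LLE (not the paper's improved variant)."""
    n = X.shape[0]
    # Squared Euclidean distances between all samples; exclude self-matches.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                 # neighbors centered on sample i
        G = Z @ Z.T                           # local Gram matrix
        G += np.eye(n_neighbors) * reg * max(np.trace(G), 1e-12)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()           # reconstruction weights sum to 1
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    _, vecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]        # skip the constant eigenvector

def subspace_similarity(Y):
    """Assumed similarity: 1 - distance / max distance, so values lie in [0, 1]."""
    d = np.sqrt(((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1))
    return 1.0 - d / max(d.max(), 1e-12)

def grid_lle_similarity(X, bounds, n_neighbors=5, n_components=2):
    """Sum the normalized similarity matrices of all subspaces."""
    total = np.zeros((X.shape[0], X.shape[0]))
    for lo, hi in bounds:                     # hypothetical wavelength index ranges
        Y = lle(X[:, lo:hi], n_neighbors, n_components)
        total += subspace_similarity(Y)
    return total

# Synthetic stand-in for spectra: 30 samples, 60 wavelength points.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 60))
bounds = [(0, 20), (20, 40), (40, 60)]
S = grid_lle_similarity(X, bounds)
```

Here `S[i, j]` is larger when samples `i` and `j` lie close together in more of the subspace embeddings. Reproducing the reported method would require the paper's two LLE improvements and its own subspace boundaries and similarity definition.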
Baoding Xu, Xiangqian Ding, Yuhua Qin, Ruichun Hou, Lei Zhang. Similarity Measurement Method of Near Infrared Spectrum Based on Grid Division Local Linear Embedding Algorithm[J]. Laser & Optoelectronics Progress, 2019, 56(3): 033001.