Spectroscopy and Spectral Analysis, 2020, 40(9): 2918, published online: 2020-11-30

Identification of New and Old Pinus Koraiensis Seeds by Near-Infrared Spectroscopy (NIRs) With t-SNE Dimensionality Reduction
Author affiliation
College of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin, Heilongjiang 150040, China
Abstract
Whether Pinus koraiensis (Korean pine) seeds are new or old is an important indicator of their edible and breeding value. Seeds stored for different lengths of time differ in deep-processing value, but they are difficult to tell apart by appearance, weight, or texture. At present, traditional biochemical methods are still used to measure the chemical properties and viability of the seeds in order to judge whether they are new or old. These methods are too slow to meet the needs of online detection, and improperly handled reagents can pollute the environment. Near-infrared spectroscopy (NIRS) is widely used in the food and forestry fields and is of practical and guiding significance for the qualitative analysis of in-shell nut forest products. In this study, NIRS was used for nondestructive testing of Pinus koraiensis seeds that matured in the current year and in previous years. First, 120 randomly selected seeds were labeled as new or old. To reduce light leakage during measurement and make the experimental data more general, near-infrared diffuse reflectance spectra were collected from the same side of every seed. Then, the raw spectra were preprocessed with standard normal variate (SNV) transformation, the first derivative, and Savitzky-Golay (SG) smoothing to reduce the influence of human factors and of the pretreatment method during the experiment and to highlight the characteristic information in the spectra. Next, principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) were used for linear and nonlinear dimensionality reduction of the preprocessed data, respectively, followed by cluster analysis and a comparison of the results. The better dimensionality reduction scheme was identified from visualizations of the data and from the clustering parameters. Nonlinear dimensionality reduction outperformed the traditional linear method on the near-infrared data of Pinus koraiensis seeds, so t-SNE was used to reduce the dimensionality of the data and obtain optimized feature variables. Finally, with the reduced data as input, two-thirds of the samples were used as the calibration set to build a support vector machine (SVM) model for classifying new and old seeds, and the remaining one-third were used as the validation set to test model performance. The results show that preprocessing the spectra with SNV, the first derivative, and SG smoothing in combination effectively removes noise, makes the absorption peaks more pronounced, and gives a clearer and smoother spectral profile, which benefits subsequent modeling. With the data reduced to two dimensions by t-SNE as the model input, the SVM classifier performed best with an RBF kernel, K = 5, γ = 82.54, and penalty coefficient C = 383.12, reaching an accuracy of up to 97.5% with an average run time of 0.02 s. Near-infrared spectroscopy can therefore be used for nondestructive identification of new and old Pinus koraiensis seeds.
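The processing chain described in the abstract (SNV, SG-smoothed first derivative, t-SNE to two dimensions, RBF-kernel SVM on a 2/3 calibration and 1/3 validation split) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names, the synthetic spectra, the SG window and polynomial order, the t-SNE perplexity, and the min-max scaling step before the SVM are all assumptions that do not come from the abstract; only the kernel, C, and γ values are taken from it. The parameter K = 5 is not defined in the abstract and is left out. Because scikit-learn's TSNE cannot project unseen samples, the embedding is computed on all samples before the split, which follows the order of steps given in the abstract.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) separately."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


def preprocess(spectra, window=11, polyorder=2):
    """SNV followed by a Savitzky-Golay smoothed first derivative.

    window and polyorder are illustrative values, not taken from the paper.
    """
    return savgol_filter(snv(spectra), window_length=window,
                         polyorder=polyorder, deriv=1, axis=1)


def make_synthetic_spectra(n_per_class=60, n_wavelengths=200, seed=0):
    """Hypothetical stand-in for the 120 measured spectra (not real data)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_wavelengths)

    def band(centre):
        # Gaussian-shaped absorption band centred at `centre`.
        return np.exp(-((x - centre) ** 2) / 0.005)

    old = band(0.3) + 0.5 * band(0.7)
    new = old + 0.3 * band(0.5)  # assumed extra band for "new" seeds
    noise = 0.02 * rng.standard_normal((2 * n_per_class, n_wavelengths))
    spectra = np.vstack([np.tile(new, (n_per_class, 1)),
                         np.tile(old, (n_per_class, 1))]) + noise
    labels = np.array([1] * n_per_class + [0] * n_per_class)
    return spectra, labels


def run_pipeline(spectra, labels, seed=0):
    """Return the 2-D t-SNE embedding and the validation accuracy of the SVM."""
    features = preprocess(spectra)
    # Nonlinear reduction to two dimensions; perplexity is an assumed value.
    embedded = TSNE(n_components=2, perplexity=30,
                    random_state=seed).fit_transform(features)
    # 2/3 calibration set, 1/3 validation set, stratified by class.
    x_cal, x_val, y_cal, y_val = train_test_split(
        embedded, labels, test_size=1 / 3, stratify=labels, random_state=seed)
    # Kernel, C and gamma follow the abstract; the scaling step is an assumption.
    model = make_pipeline(MinMaxScaler(),
                          SVC(kernel="rbf", C=383.12, gamma=82.54))
    model.fit(x_cal, y_cal)
    return embedded, model.score(x_val, y_val)


if __name__ == "__main__":
    spectra, labels = make_synthetic_spectra()
    embedded, accuracy = run_pipeline(spectra, labels)
    print(f"embedding shape: {embedded.shape}, validation accuracy: {accuracy:.3f}")
```

The accuracy printed for the synthetic spectra says nothing about the 97.5% reported in the abstract; the sketch only shows how the steps fit together. With real spectra, the array returned by `make_synthetic_spectra` would be replaced by a samples-by-wavelengths matrix and a matching label vector.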

LI Hong-bo, CAO Jun, JIANG Da-peng, ZHANG Dong-yan, ZHANG Yi-zhuo. Identification of New and Old Pinus Koraiensis Seeds by Near-Infrared Spectroscopy (NIRs) With t-SNE Dimensionality Reduction[J]. Spectroscopy and Spectral Analysis, 2020, 40(9): 2918.
