Spectroscopy and Spectral Analysis, 2019, 39(2): 448. Published online: 2019-03-06

Application of 17 Classification Algorithms for Authentication Research of Various Boletus
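Author affiliations
1 College of Agronomy and Biotechnology, Yunnan Agricultural University, Kunming 650201, Yunnan, China
2 Institute of Medicinal Plants, Yunnan Academy of Agricultural Sciences, Kunming 650200, Yunnan, China
3 College of Resources and Environment, Yuxi Normal University, Yuxi 653100, Yunnan, China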
Abstract
Some poisonous mushrooms resemble wild edible mushrooms in morphology and biological characteristics. Farmers who collect them by experience alone can easily confuse the two, which leads to serious food safety incidents. Yunnan has the highest yield and export volume of wild edible mushrooms in China, and the industry contributes substantially to the province's rural economy. Rapid authentication of wild edible mushroom species therefore supports the healthy development of the industry, and analysis of the genetic relationships among edible mushrooms benefits breeding work. Samples of seven bolete species were collected from seven origins in Yunnan and surrounding areas. Infrared fingerprints of the stipes and caps were acquired separately with a Fourier transform infrared (FTIR) spectrometer. The preprocessed stipe and cap spectra were fused using low-level and mid-level data fusion strategies, and stipe, cap, low-level fusion and mid-level fusion models were built with 17 algorithms drawn from Decision Trees, Discriminant Analysis, Logistic Regression Classifiers, Support Vector Machines, Nearest Neighbor Classifiers and Ensemble Classifiers. Each classification model was run 10 consecutive times, and the best algorithm for species authentication was selected by comparing the mean training-set classification accuracy. Hierarchical cluster analysis (HCA) was performed on the mid-level fusion dataset to infer the genetic relationships among the species. The results showed that: (1) Linear Discriminant was the best algorithm for the stipe, cap and low-level fusion models, with training-set accuracies of 92.8%, 96.4% and 97.6%, respectively, while Subspace Discriminant was the best algorithm for the mid-level fusion model, with a training-set accuracy of 100%. (2) For the best stipe, cap, low-level fusion and mid-level fusion models, the mean classification accuracy over all samples was 93.61%, 95.54%, 96.99% and 99.88%, respectively. The mid-level fusion model outperformed the other three, indicating that it can separate highly similar samples and that it reduces the influence of origin on species authentication. (3) In the HCA of the mid-level fusion dataset, Boletus magnificus and B. edulis had the smallest clustering distance, indicating similar chemical information and a close genetic relationship. (4) Boletus magnificus and Leccinum duriusculum had the largest clustering distance, indicating large differences in chemical information and a distant genetic relationship. In summary, fusing FTIR spectra of different parts with a mid-level strategy, combined with Subspace Discriminant and HCA, can accurately authenticate different bolete species and rapidly infer their genetic relationships, and may serve as a new method for species authentication and genetic relationship inference of wild edible mushrooms.
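The difference between the two fusion strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how features were extracted for mid-level fusion, so PCA scores are an assumption here, as are the autoscaling step, the function names, the random stand-in data and the number of components.

```python
import numpy as np

def autoscale(X):
    """Mean-center each column and scale it to unit variance."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (X - X.mean(axis=0)) / std

def pca_scores(X, n_components):
    """Project mean-centered X onto its first principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def low_level_fusion(stipe, cap):
    """Low-level fusion: concatenate the two spectral blocks variable by variable."""
    return np.hstack([autoscale(stipe), autoscale(cap)])

def mid_level_fusion(stipe, cap, n_components=5):
    """Mid-level fusion: extract features per block (PCA assumed), then concatenate."""
    return np.hstack([pca_scores(stipe, n_components),
                      pca_scores(cap, n_components)])

# Hypothetical stand-in data: 20 samples, 100 spectral points per block.
rng = np.random.default_rng(0)
stipe = rng.normal(size=(20, 100))
cap = rng.normal(size=(20, 100))

low = low_level_fusion(stipe, cap)                  # all variables from both blocks
mid = mid_level_fusion(stipe, cap, n_components=5)  # a few features from each block
print(low.shape, mid.shape)
```

In this sketch the low-level matrix keeps every variable from both blocks, while the mid-level matrix keeps only a handful of extracted features per block. Either matrix could then be passed to a classifier or to a hierarchical clustering routine.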

ZHANG Yu, LI Jie-qing, LI Tao, LIU Hong-gao, WANG Yuan-zhong. Application of 17 Classification Algorithms for Authentication Research of Various Boletus[J]. Spectroscopy and Spectral Analysis, 2019, 39(2): 448.
