Spectroscopy and Spectral Analysis, 2020, 40 (2): 512, Published online: 2020-05-12

可能模糊鉴别C均值聚类的茶叶FTNIR分类研究

Classification of FTNIR Spectra of Tea via Possibilistic Fuzzy Discriminant C-Means Clustering
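Author Affiliations
1 Department of Information Engineering, Chuzhou Polytechnic, Chuzhou 239000, Anhui, China
2 School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China
3 Key Laboratory of Facility Agriculture Measurement and Control Technology and Equipment of Machinery Industry, Jiangsu University, Zhenjiang 212013, Jiangsu, China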
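Abstract (translated from the Chinese original)
Fourier transform near-infrared (FTNIR) spectra of tea carry information about its organic chemical constituents. Because tea varieties differ in both the kinds and the amounts of these constituents, classifying tea varieties from FTNIR spectra is feasible. However, tea near-infrared spectra are high-dimensional, contain peaks and troughs, and overlap and interleave with one another, which makes accurate classification difficult. To address this, a possibilistic fuzzy discriminant c-means clustering (PFDCM) algorithm is proposed that introduces fuzzy linear discriminant analysis (FLDA) into possibilistic fuzzy c-means clustering (PFCM); during fuzzy clustering, FLDA extracts discriminant information from the spectra and transforms the data space. The fuzzy memberships and typicality values obtained by PFDCM enable accurate clustering of tea near-infrared spectra, and the method offers fast clustering and high accuracy. Because the typicality values of PFDCM are not bound by the constraint that memberships sum to 1, PFDCM outperforms fuzzy c-means clustering (FCM) on noisy spectral data. A total of 260 samples of four teas (Yuexi Cuilan, Lu'an Guapian, Shiji Maofeng and Huangshan Maofeng) were collected, and their spectra were acquired with an Antaris Ⅱ Fourier transform near-infrared spectrometer. The spectral wavenumber range was 10 000 to 4 000 cm⁻¹, giving 1 557-dimensional data. First, the spectra were preprocessed with multiplicative scatter correction (MSC) to reduce scattering and noise effects and to raise the signal-to-noise ratio. Second, principal component analysis (PCA) reduced the dimensionality of the spectral data to 7. Third, linear discriminant analysis (LDA) extracted discriminant information and further reduced the dimensionality to 3. Finally, FCM, PFCM and PFDCM were each used to cluster the data and classify the tea varieties. Results: with weight index m=2.0 and η=2.0, the clustering accuracies of FCM, PFCM and PFDCM were 93.60%, 93.02% and 98.84%, respectively; FCM converged after 25 iterations, whereas PFCM and PFDCM converged after 8 and 23 iterations, respectively; in time to convergence, FCM took the least and PFDCM the most. FTNIR combined with MSC, PCA, LDA and PFDCM provides a classification model for the accurate identification of tea varieties.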
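The MSC preprocessing step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function name `msc` and the choice of the mean spectrum as the reference are assumptions (the mean is the common default, but the abstract does not say which reference was used). Each spectrum is regressed on the reference, and the fitted offset and slope are then removed.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction (illustrative sketch).

    spectra:   array of shape (n_samples, n_wavenumbers)
    reference: spectrum to regress against; assumed here to default
               to the mean spectrum of the data set.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Least-squares fit: row ≈ intercept + slope * reference
        slope, intercept = np.polyfit(reference, row, deg=1)
        # Undo the additive offset and the multiplicative scatter
        corrected[i] = (row - intercept) / slope
    return corrected
```

Applied to a 260 × 1 557 matrix like the one described above, the output would have the same shape and could then be passed on to PCA and LDA.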
Abstract
Fourier transform near-infrared (FTNIR) spectra contain valuable information about the chemical constituents of tea. Because the constituents and their contents differ among tea varieties, it is feasible to classify tea varieties by FTNIR. However, FTNIR spectra are high-dimensional, contain crests and troughs, and overlap and interleave with one another, so they are difficult to classify. To solve this problem, possibilistic fuzzy discriminant c-means clustering (PFDCM) was proposed by introducing fuzzy linear discriminant analysis (FLDA) into possibilistic fuzzy c-means clustering (PFCM), with the aim of discriminating FTNIR spectra correctly. During fuzzy clustering, FLDA not only extracts discriminant information from the FTNIR spectra but also transforms the data space. PFDCM classifies FTNIR spectra accurately according to its fuzzy membership and typicality values, and it offers fast speed and high accuracy. PFDCM is superior to fuzzy c-means (FCM) clustering on spectra containing noisy data because its typicality values are not subject to the constraint that the membership degrees sum to one. Samples of four tea varieties (Yuexi Cuilan, Lu’an Guapian, Shiji Maofeng and Huangshan Maofeng) were collected, and a total of 260 tea samples were scanned over the range of 10 000 to 4 000 cm⁻¹ with an FTNIR spectrometer, yielding 1 557-dimensional data for further processing. First, the spectral data were pretreated with multiplicative scatter correction (MSC) to reduce spectral scattering and noise effects and to increase the signal-to-noise ratio. Second, principal component analysis (PCA) was used to reduce the dimensionality of the FTNIR spectra to seven. Third, linear discriminant analysis (LDA) extracted discriminant information from the spectra and reduced the dimensionality of the data from seven to three. Finally, FCM, PFCM and PFDCM were each applied to cluster the data and classify the tea varieties. The experimental results showed that, with weight index m=2.0 and η=2.0, the clustering accuracies of FCM, PFCM and PFDCM were 93.60%, 93.02% and 98.84%, respectively. FCM converged after 25 iterations, whereas PFCM and PFDCM converged after 8 and 23 iterations, respectively. In time to convergence, FCM consumed the least and PFDCM the most. In conclusion, FTNIR coupled with MSC, PCA, LDA and PFDCM provides a classification model for the accurate identification of tea varieties.
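To make the distinction between fuzzy memberships and typicality values concrete, the following is a minimal NumPy sketch of standard FCM together with the typicality formula commonly used in PFCM. It is not the PFDCM algorithm of the paper, which additionally embeds FLDA in the iteration. The function names, the constants `b` and `K`, and the way the scale `gamma` is estimated are assumptions taken from the usual PFCM formulation, not from this abstract; `m=2.0` and `eta=2.0` follow the values reported above.

```python
import numpy as np

def sq_dists(X, centers):
    # (c, n) squared Euclidean distances, floored to avoid division by zero
    diff = centers[:, None, :] - X[None, :, :]
    return np.maximum((diff ** 2).sum(axis=2), 1e-12)

def fcm(X, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means; returns (centers, U) with U of shape (c, n)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((n_clusters, X.shape[0]))
    U /= U.sum(axis=0)                      # memberships of each sample sum to 1
    for _ in range(max_iter):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        inv = sq_dists(X, centers) ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    W = U ** m                              # centers consistent with the final U
    centers = (W @ X) / W.sum(axis=1, keepdims=True)
    return centers, U

def pfcm_typicality(X, centers, U, m=2.0, eta=2.0, b=1.0, K=1.0):
    """Typicality values as in the usual PFCM formulation (assumed here).

    Unlike U, the columns of the result need not sum to 1, so a noisy
    sample can receive low typicality for every cluster.
    """
    d2 = sq_dists(np.asarray(X, dtype=float), centers)
    W = U ** m
    gamma = K * (W * d2).sum(axis=1) / W.sum(axis=1)   # per-cluster scale
    return 1.0 / (1.0 + (b * d2 / gamma[:, None]) ** (1.0 / (eta - 1.0)))
```

On data with an outlier, FCM still forces the outlier's memberships to sum to 1, while its typicality stays small for all clusters. This is the property the abstract credits for the robustness of PFDCM to noisy spectra.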

武斌, 傅海军, 武小红, 陈勇, 贾红雯. 可能模糊鉴别C均值聚类的茶叶FTNIR分类研究[J]. 光谱学与光谱分析, 2020, 40(2): 512.
WU Bin, FU Hai-jun, WU Xiao-hong, CHEN Yong, JIA Hong-wen. Classification of FTNIR Spectra of Tea via Possibilistic Fuzzy Discriminant C-Means Clustering[J]. Spectroscopy and Spectral Analysis, 2020, 40(2): 512.
