Spectroscopy and Spectral Analysis, 2018, 38 (10): 3058, published online: 2018-11-25

Determination of Soluble Sugar Content in Head Cabbage by Near-Infrared Spectroscopy

Prediction of Soluble Sugar Content in Cabbage by Near Infrared Spectrometer
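Author affiliations
1 Key Laboratory of Modern Precision Agriculture System Integration Research, Ministry of Education, China Agricultural University, Beijing 100083, China
2 College of Science, Hebei University of Architecture, Zhangjiakou, Hebei 075000, China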
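Abstract (translation of the Chinese abstract)
Head cabbage is a common vegetable rich in carbohydrates, and soluble sugar content is an important parameter determining its quality. Soluble sugars dissolve readily in water and are effective regulators of the taste of vegetables and fruits. As carbohydrates, soluble sugars consist of three elements, C, H and O. Their molecular absorption spectra arise mainly from combination-band and overtone absorptions of groups such as C-H, O-H and C=O in the molecules of the measured material, and carry abundant information on organic matter. Near-infrared spectroscopy and chemometric methods were therefore used to explore a rapid method for determining soluble sugar content in head cabbage. Spectra of 161 head cabbage samples were collected with a MATRIX-I Fourier-transform near-infrared spectrometer (Bruker, Germany) over the wavenumber range 12 800 to 4 000 cm⁻¹ (780 to 2 500 nm). The soluble sugar content of the samples was measured by the anthrone colorimetric method. The Mahalanobis distance (MD) method and Monte Carlo cross-validation (MCCV) were applied together to remove abnormal samples, and the Kennard-Stone (K-S) method was used to divide the samples into a calibration set and a validation set in a given ratio. Twelve spectral pretreatment methods, namely Savitzky-Golay convolution smoothing (S-G), first derivative (FD), second derivative (SD), multiplicative scatter correction (MSC), standard normal variate (SNV) and their combinations, were applied to the spectra to find the best pretreatment and improve the signal-to-noise ratio of the spectral data. Competitive adaptive reweighted sampling (CARS) was used to retain the wavenumber points with large absolute regression coefficients in the partial least squares (PLS) regression model and discard those with small absolute regression coefficients, so as to select the combination of wavenumbers most relevant to the measured property and obtain a calibration model with good robustness and strong predictive ability. The coefficient of determination (R²), the root mean square error of cross-validation (RMSECV) and the root mean square error of prediction (RMSEP) were used as indicators of model accuracy. Based on the principles of MCCV and MD outlier detection, 10 samples with abnormal spectra or abnormal chemical values were removed, leaving 151 samples for modelling. After outlier removal, the K-S method divided the samples at a ratio of 3:1 into a calibration set (110 samples) and a validation set (41 samples). PLS models were built from the raw spectra, the pretreated spectra, and the spectra at the selected wavenumbers. The results showed that MSC+FD pretreatment improved modelling accuracy: the calibration-set R² rose from 0.68 before pretreatment to 0.93, making MSC+FD the preferred pretreatment method in this study. CARS selected 84 modelling wavenumbers in total. In the 12 000 to 10 000 cm⁻¹ region there are second-overtone O-H and third-overtone C-H stretching absorptions; the main background information in this region comes from water and other hydrogen-containing groups, and 36 of the selected wavenumbers fall here. In the 8 500 to 6 000 cm⁻¹ region there are first-overtone O-H stretching absorptions of sugars and water and of glucose; this is the main spectral interval reflecting soluble sugar components, with little background interference, and CARS selected 15 modelling wavenumbers here. The 5 800 to 4 000 cm⁻¹ region resembles the 12 000 to 10 000 cm⁻¹ region in containing many selected wavenumbers; CARS selected 33 modelling wavenumbers here. Selecting the modelling wavenumbers with CARS reduced irrelevant information and model complexity. The selected wavenumbers bring in spectra that characterize the target component as well as spectra that represent background information, which makes the calibration model more adaptable. A full-spectrum PLS model for soluble sugar in head cabbage was built, and a CARS-PLS model was built from the CARS wavenumber selection. For the full-spectrum PLS model, the calibration-set R² was 0.93, the RMSECV 0.157 2% and the RMSEP 0.132 8%. For the CARS-PLS model, the calibration-set R² was 0.96, the RMSECV 0.076 8% and the RMSEP 0.059 4%. The two models have comparable R², but the RMSECV of the CARS-PLS model is about half that of the full-spectrum PLS model, and its RMSEP is also close to half. The CARS-PLS model uses fewer modelling variables than the full-spectrum PLS model, so it is simpler and more accurate. Predicting the 41 validation samples with the CARS-PLS model gave a prediction-set R² of 0.86 and a standard error of prediction of 0.059 4%. The work provides an efficient non-destructive method for assessing the quality of head cabbage.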
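The Kennard-Stone split mentioned in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance between spectra and the common variant that starts from the two most distant samples; the abstract does not state which distance or variant the authors used. The function name `kennard_stone` and the toy data are hypothetical, not taken from the paper.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule.

    Assumption: Euclidean distance between samples (not stated in the abstract).
    """
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Add the candidate whose nearest already-selected sample is farthest away.
        best = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy one-variable "spectra" (hypothetical values, not data from the paper).
toy = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
calibration_idx = kennard_stone(toy, 3)
validation_idx = [i for i in range(len(toy)) if i not in calibration_idx]
```

Under these assumptions the selected samples form the calibration set and the rest form the validation set, so the calibration set spans the extremes of the spectral space.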
Abstract
Soluble sugar is an effective regulator of the taste of vegetables and fruits, and a carbohydrate that humans need to absorb and use. Head cabbage is a common vegetable rich in carbohydrates, and soluble sugar content is an important parameter of its nutritional quality. Carbohydrates are made up of carbon, hydrogen and oxygen; their molecular absorption spectra consist mainly of combination bands and overtone bands of C-H, O-H and C=O groups and carry abundant information on organic matter. This study investigated the rapid determination of soluble sugar content in head cabbage by near-infrared spectroscopy and chemometrics. A total of 161 head cabbage samples were collected. Spectra were measured with a MATRIX-I FT-NIR spectrometer (Bruker, Germany), and soluble sugar was measured by the anthrone colorimetric method. The Mahalanobis distance (MD) method and Monte Carlo cross-validation (MCCV) were used to eliminate abnormal samples, and the Kennard-Stone (K-S) method was then used to divide the samples into a calibration set and a validation set in a given ratio. Twelve spectral pretreatment methods, comprising Savitzky-Golay convolution smoothing (S-G), first derivative (FD), second derivative (SD), multiplicative scatter correction (MSC), standard normal variate (SNV) and their combinations, were compared to find the one that best improved the signal-to-noise ratio. The competitive adaptive reweighted sampling (CARS) algorithm was used to retain the wavenumbers with large absolute regression coefficients in the PLS model and to remove those with small absolute regression coefficients. In this way the combination of wavenumbers most relevant to the measured property can be selected, giving a calibration model with good robustness and predictive ability.
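As an illustration of two of the pretreatments listed above, the following is a minimal sketch of multiplicative scatter correction followed by a first derivative. It assumes the mean spectrum as the MSC reference and uses a plain finite difference as a stand-in for the derivative; the abstract does not say how the authors computed the derivative. The function names and toy spectra are hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum s is fitted as s = offset + slope * ref by least squares and
    then corrected to (s - offset) / slope. Assumes the reference is not flat
    and no fitted slope is zero.
    """
    n_points = len(spectra[0])
    ref = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_points)]
    ref_mean = sum(ref) / n_points
    ref_var = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_points
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_var
        offset = s_mean - slope * ref_mean
        corrected.append([(x - offset) / slope for x in s])
    return corrected


def first_difference(spectrum):
    """Finite-difference stand-in for a first derivative (an assumption)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Toy spectra: one shape with different scatter-like gains and offsets
# (hypothetical values, not data from the paper).
base = [1.0, 2.0, 4.0, 3.0]
spectra = [base, [2.0 * x + 0.5 for x in base], [0.5 * x - 0.2 for x in base]]
corrected = msc(spectra)
derivatives = [first_difference(s) for s in corrected]
```

In this toy case the three spectra differ only by gain and offset, so MSC maps them onto the same curve; real spectra keep their chemical differences after correction.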
The coefficient of determination (R²), root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP) were used to evaluate the models. Based on the principles of MCCV and the MD method, 10 abnormal samples were eliminated, leaving 151 samples for modelling. The K-S method divided the samples at a ratio of 3:1 into a calibration set (110 samples) and a validation set (41 samples). Three PLS models were established using the original spectral data, the preprocessed spectral data, and the spectral data at the optimal wavenumbers, respectively. The modelling results showed that MSC combined with FD improved modelling accuracy: the R² of the calibration model increased from 0.68 to 0.93, so MSC+FD was considered the best spectral pretreatment method in this experiment. The CARS method selected a total of 84 optimal wavenumbers for modelling. From 12 000 to 10 000 cm⁻¹ there are the second overtone of O-H stretching and the third overtone of C-H stretching, and the main background information in this region comes from water and other hydrogen-containing groups; 36 optimal wavenumbers were selected in this region. From 8 500 to 6 000 cm⁻¹ there are the first overtones of O-H stretching of sugars and water and of glucose. This region is the main spectral region containing soluble sugar information and is less affected by the background; 15 optimal wavenumbers were selected here. The region from 5 800 to 4 000 cm⁻¹ resembles the region from 12 000 to 10 000 cm⁻¹ and contains 33 selected optimal wavenumbers. A full-spectrum PLS model and, based on the CARS wavenumber selection, a CARS-PLS model were established to estimate the soluble sugar content of head cabbage. The R², RMSECV and RMSEP of the full-spectrum PLS model were 0.93, 0.157 2% and 0.132 8%, respectively.
For the CARS-PLS model, the R², RMSECV and RMSEP were 0.96, 0.076 8% and 0.059 4%, respectively. The two models had similar R², but the RMSECV of the CARS-PLS model was about half that of the full-spectrum PLS model, and its RMSEP was also close to half. The CARS algorithm reduced the number of modelling variables, which simplified the model and improved its accuracy. The CARS-PLS model was used to predict the 41 samples of the validation set; the R² of the prediction set was 0.86 and the standard error of prediction was 0.059 4%, indicating that the model for soluble sugar content in head cabbage is practical. CARS reduces irrelevant information and model complexity, and the selected wavenumbers bring in both spectra related to the target component and spectra related to the background, which improves the adaptability of the calibration model. The approach offers a new way to evaluate the quality of head cabbage.
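The evaluation metrics and the "about half" comparison can be checked with a short sketch. It assumes the usual definitions: R² as one minus the ratio of residual to total sum of squares, and RMSE as the root of the mean squared residual (RMSECV when the predictions come from cross-validation, RMSEP when they come from the validation set). The abstract does not give the exact formulas, and the toy values are hypothetical; only the four reported error values are taken from the text.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy reference and predicted values (hypothetical, not data from the paper).
y_ref = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
toy_r2 = r_squared(y_ref, y_hat)
toy_rmse = rmse(y_ref, y_hat)

# Ratios of the errors reported in the abstract (CARS-PLS / full-spectrum PLS).
rmsecv_ratio = 0.0768 / 0.1572  # roughly 0.49
rmsep_ratio = 0.0594 / 0.1328   # roughly 0.45
```

Both ratios are close to one half, which is consistent with the comparison stated in the abstract.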

LI Hong-qiang, SUN Hong, LI Min-zan. Prediction of Soluble Sugar Content in Cabbage by Near Infrared Spectrometer[J]. Spectroscopy and Spectral Analysis, 2018, 38(10): 3058.
