Spectroscopy and Spectral Analysis (光谱学与光谱分析), Vol. 34, No. 4, pp. 947-951

An Optimal Selection Method of Samples of Calibration Set and Validation Set for Spectral Multivariate Analysis


Abstract

This paper analyzes how an uneven distribution of sample numbers over the property range, in both the calibration set and the validation set, degrades spectral multivariate calibration. It also reveals an "averaging" phenomenon in practical spectral multivariate calibration: samples with small property values tend to be predicted too high, and samples with large property values too low. A new sample selection method, Rank-KS, is proposed. It considers the spectral space and the property space together: the property space (Y-space) is divided evenly into several small intervals, and within each interval the Kennard-Stone (KS) algorithm selects the calibration samples and the Random-Select (RS) algorithm selects the validation samples. The resulting calibration and validation sets are distributed markedly more uniformly over the property range, and the calibration set is strongly representative. The method was applied to the determination of dimethyl carbonate (DMC) content in gasoline by infrared spectroscopy and of dimethyl sulfoxide concentration in its aqueous solution by near-infrared spectroscopy. The averaging phenomenon seen in the predictions of the multiple linear regression (MLR) model for dimethyl sulfoxide was effectively weakened by Rank-KS. For comparison, calibration and validation sets were selected with RS, KS, Rank-Select (the concentration gradient method), sample set partitioning based on joint X- and Y-blocks (SPXY), and the proposed Rank-KS, and MLR and partial least squares (PLS1) models were built for both DMC and dimethyl sulfoxide. The results show that Rank-KS improves the uniformity of the sample distribution over the property range and gives the best predictions. In particular, for sample sets with many samples in the middle of the property range and few at the ends, or with no samples in some local regions, selecting the calibration set with Rank-KS clearly improves predictive ability for both MLR and PLS1 models and yields the smallest root mean square error of prediction.
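The interval-wise selection described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the number of intervals, the number of samples taken per interval, or the distance metric, so `n_bins`, `n_cal_per_bin`, `n_val_per_bin`, the Euclidean distance, and the function names `kennard_stone` and `rank_ks_split` are all hypothetical choices made here.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Pick n_select row indices of X with the Kennard-Stone max-min rule."""
    n = X.shape[0]
    if n_select <= 0:
        return []
    if n_select >= n:
        return list(range(n))
    # Pairwise Euclidean distances between spectra (assumed metric).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start from the two most distant samples (truncated if n_select == 1).
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)][:n_select]
    while len(selected) < n_select:
        remaining = [k for k in range(n) if k not in selected]
        # Distance from each remaining sample to its nearest selected sample.
        min_d = dist[np.ix_(remaining, selected)].min(axis=1)
        selected.append(remaining[int(np.argmax(min_d))])
    return selected


def rank_ks_split(X, y, n_bins=5, n_cal_per_bin=6, n_val_per_bin=3, seed=0):
    """Sketch of a Rank-KS style split (all parameters are assumptions).

    The property range is cut into n_bins equal-width intervals. In each
    interval, Kennard-Stone picks up to n_cal_per_bin calibration samples
    from the spectra, then up to n_val_per_bin validation samples are drawn
    at random from what is left in that interval.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    # Interior edges only, so bin labels run from 0 to n_bins - 1.
    bins = np.digitize(y, edges[1:-1])
    cal, val = [], []
    for b in range(n_bins):
        idx = np.flatnonzero(bins == b)
        if idx.size == 0:
            continue  # no samples in this part of the property range
        n_cal = min(n_cal_per_bin, idx.size)
        cal_idx = idx[kennard_stone(X[idx], n_cal)]
        rest = np.setdiff1d(idx, cal_idx)
        n_val = min(n_val_per_bin, rest.size)
        val_idx = rng.choice(rest, size=n_val, replace=False) if n_val else []
        cal.extend(int(k) for k in cal_idx)
        val.extend(int(k) for k in val_idx)
    return np.array(cal, dtype=int), np.array(val, dtype=int)


if __name__ == "__main__":
    # Synthetic data: many samples in the middle of the range, few at the ends.
    rng = np.random.default_rng(1)
    y = np.concatenate([rng.normal(0.5, 0.05, 80), rng.uniform(0.0, 1.0, 20)])
    X = np.outer(y, np.linspace(1.0, 2.0, 10)) + rng.normal(0, 0.01, (100, 10))
    cal, val = rank_ks_split(X, y)
    print(len(cal), "calibration samples,", len(val), "validation samples")
```

Taking a fixed number of samples per interval, rather than a fixed fraction, is one way to flatten the sample count over the property range; intervals with too few samples simply contribute what they have, so sparse ends may still leave the validation set thin there.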

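Supplementary information

CLC number: O657.3

DOI: 10.3964/j.issn.1000-0593(2014)04-0947-05

Funding: Supported by the National Key Technology R&D Program of China (2011BAE11B00), the National High Technology Research and Development Program of China (863 Program) (2009AA04Z135), and the National Natural Science Foundation of China (60974065)

Received: 2013-07-01

Revised: 2013-10-15

Published online: --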

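Author affiliations

LIU Wei (刘伟): College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
ZHAO Zhong (赵众): College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
YUAN Hong-fu (袁洪福): College of Materials Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China
SONG Chun-feng (宋春风): College of Materials Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China
LI Xiao-yu (李效玉): College of Materials Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China

Corresponding author: LIU Wei (liuwei_email@qq.com)

Note: LIU Wei, born in 1988, master's student at the College of Information Science and Technology, Beijing University of Chemical Technology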

