Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2018, 38(1): 31. Published online: 2018-01-30

Research on Genetic Algorithm Based on Mutual Information in the Spectrum Selection (基于互信息的遗传算法在光谱谱段选择中应用)
Author affiliation
College of Information Science and Engineering, Ocean University of China, Qingdao 266100, Shandong, China
Abstract
In near-infrared (NIR) spectroscopy, an accurate and robust quantitative model is essential. Modeling on the full spectrum increases modeling and prediction time and reduces the robustness and prediction accuracy of the model, so an effective variable selection method is important for model construction. To address this problem, a genetic algorithm based on mutual information (GAs-MI) is proposed for selecting feature variables: mutual information first filters out a large amount of irrelevant and redundant information, and the genetic algorithm then selects the most discriminative features. The Shapley value method is introduced into the mutation step of the genetic algorithm to reduce the randomness that comes from manually set parameters. To verify the algorithm, 273 representative tobacco leaf samples were used as experimental material; 182 randomly selected samples were used to build a PLS quantitative model of total nicotine in tobacco, and the remaining samples served as the test set. The correlation coefficient (R), the root mean square error of cross-validation (RMSECV), and the root mean square error of prediction (RMSEP) were used as model evaluation indexes. The experimental results show that the model built on the wavelengths selected by this method is simpler and has stronger predictive ability.
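The mutual-information pre-filtering stage and the 182/91 sample split described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not state how mutual information was estimated, so an equal-width histogram estimator with 8 bins is used; the data are synthetic random "spectra" rather than the tobacco spectra; and the function names (`mutual_information`, `mi_filter`, `run_demo`) are hypothetical. The genetic algorithm, the Shapley-value mutation step, and the PLS model are not sketched; only the two error metrics named in the abstract are included as small helpers.

```python
import math
import random
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each value to an integer bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=8):
    """Histogram estimate of I(X; Y) in nats (assumed estimator, not from the paper)."""
    bx, by = equal_width_bins(x, n_bins), equal_width_bins(y, n_bins)
    n = len(x)
    cx, cy, cxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    # sum over occupied cells of p(i, j) * log(p(i, j) / (p(i) * p(j)))
    return sum((c / n) * math.log(c * n / (cx[i] * cy[j]))
               for (i, j), c in cxy.items())


def mi_filter(spectra, target, keep):
    """Return (sorted indices of the `keep` highest-MI variables, all MI scores)."""
    n_vars = len(spectra[0])
    scores = [mutual_information([row[j] for row in spectra], target)
              for j in range(n_vars)]
    ranked = sorted(range(n_vars), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep]), scores


def rmse(y_true, y_pred):
    """Root mean square error; RMSEP when applied to test-set predictions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def pearson_r(a, b):
    """Correlation coefficient R between reference and predicted values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def run_demo(seed=0):
    """Synthetic stand-in: 273 samples, 40 variables, 2 of them informative."""
    rng = random.Random(seed)
    n_samples, n_vars, informative = 273, 40, (5, 20)
    spectra = [[rng.random() for _ in range(n_vars)] for _ in range(n_samples)]
    target = [sum(row[j] for j in informative) + rng.gauss(0, 0.05)
              for row in spectra]
    # Random split into 182 calibration samples and 91 test samples.
    order = list(range(n_samples))
    rng.shuffle(order)
    cal, test = order[:182], order[182:]
    selected, _ = mi_filter([spectra[i] for i in cal],
                            [target[i] for i in cal], keep=6)
    return selected, len(cal), len(test)


if __name__ == "__main__":
    selected, n_cal, n_test = run_demo()
    print("calibration:", n_cal, "test:", n_test, "kept variables:", selected)
```

In a pipeline like the one the abstract outlines, the variables kept by such a filter would form the search space for the genetic algorithm, with each candidate subset scored by a cross-validated PLS model. Computing the filter on the calibration set only, as done here, is a design choice that keeps the test samples out of variable selection.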
Cite as: KONG Qing-qing, GONG Hui-li, DING Xiang-qian, LIU Ming. Research on Genetic Algorithm Based on Mutual Information in the Spectrum Selection[J]. Spectroscopy and Spectral Analysis, 2018, 38(1): 31. (孔清清, 宫会丽, 丁香乾, 刘明. 基于互信息的遗传算法在光谱谱段选择中应用[J]. 光谱学与光谱分析, 2018, 38(1): 31.)
