Application of Improved Random Forest Pruning Algorithm in Tobacco Origin Identification of Near Infrared Spectrum
Abstract
To build a more accurate and efficient model for identifying the origin of tobacco leaves, a random forest pruning algorithm based on an adaptive genetic algorithm (AGARFP) is proposed. The algorithm switches between selection operators according to how far the population has evolved, and the improved adaptive genetic algorithm is then used to prune the random forest. Samples from five producing areas are used to build the tobacco origin identification model, with origin identification accuracy as the measure of algorithm quality. Experimental results show that AGARFP reaches a classification accuracy of 94.67% and outperforms the other methods compared, which demonstrates the effectiveness of the proposed algorithm.
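The pruning idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' AGARFP implementation: the abstract does not give the adaptation rule, the operators, or any parameter values, so everything below is an assumption made for illustration. The sketch encodes a pruned forest as a 0/1 mask over the trees, scores a mask by majority-vote accuracy, and picks between two standard selection operators (roulette wheel and tournament) based on how spread out the population's fitness values are. The names `vote_accuracy`, `prune_forest`, the 0.05 spread threshold, and all default parameters are hypothetical.

```python
import random
from collections import Counter


def vote_accuracy(mask, tree_preds, labels):
    """Majority-vote accuracy of the trees kept by a 0/1 mask."""
    chosen = [preds for keep, preds in zip(mask, tree_preds) if keep]
    if not chosen:
        return 0.0  # an empty forest classifies nothing correctly
    correct = 0
    for i, y in enumerate(labels):
        winner = Counter(preds[i] for preds in chosen).most_common(1)[0][0]
        correct += winner == y
    return correct / len(labels)


def roulette(pop, fits, rng):
    """Fitness-proportional selection (weak pressure, keeps diversity)."""
    return rng.choices(pop, weights=[f + 1e-9 for f in fits], k=1)[0]


def tournament(pop, fits, rng, k=3):
    """Best of k random individuals (stronger selection pressure)."""
    idx = max(rng.sample(range(len(pop)), k), key=lambda i: fits[i])
    return pop[idx]


def prune_forest(tree_preds, labels, pop_size=20, generations=30,
                 p_mut=0.05, seed=0):
    """Search for a tree subset with a simple genetic algorithm.

    tree_preds[t][i] is the class predicted by tree t for sample i.
    Returns (best_mask, best_accuracy).
    """
    rng = random.Random(seed)
    n = len(tree_preds)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    pop.append([1] * n)  # the unpruned forest is always a candidate
    best, best_fit = None, -1.0
    for _ in range(generations):
        fits = [vote_accuracy(m, tree_preds, labels) for m in pop]
        top = max(range(pop_size), key=lambda i: fits[i])
        if fits[top] > best_fit:
            best, best_fit = pop[top][:], fits[top]
        # Assumed adaptation rule (not from the paper): while fitness values
        # are still spread out, use roulette selection; once they have
        # converged, switch to tournament selection.
        spread = max(fits) - sum(fits) / pop_size
        select = roulette if spread > 0.05 else tournament
        nxt = [best[:]]  # elitism: carry the best mask forward
        while len(nxt) < pop_size:
            a, b = select(pop, fits, rng), select(pop, fits, rng)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit flip
            nxt.append(child)
        pop = nxt
    return best, best_fit
```

In practice `tree_preds` would hold each trained tree's predictions on a held-out validation set, so that the mask is not fitted to the same samples used to report accuracy. Because the unpruned forest is seeded into the initial population and the best mask is retained, the returned accuracy on that set is never lower than that of the full forest.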
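CLC number: O433.4
Section: Spectroscopy
Funding: National Science and Technology Support Program; National Key Research and Development Program
Received: 2017-06-20
Revised: --
Published online: 2018-01-01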
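Author affiliations
Ding Xiangqian: College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100, China
Gong Huili: College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100, China
Corresponding author: Gong Huili (huiligong@163.com)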
Cite this paper
Kong Qingqing, Ding Xiangqian, Gong Huili. Application of Improved Random Forest Pruning Algorithm in Tobacco Origin Identification of Near Infrared Spectrum[J]. Laser & Optoelectronics Progress, 2018, 55(1): 013006