Spectroscopy and Spectral Analysis, 2021, 41(2): 624, published online: 2021-04-08

XGBoost-Based Classification and Identification Method for Aluminum Alloy LIBS Spectra

Study on Identification Method Based on XGBoost Model for Aluminum Alloy Using Laser-Induced Breakdown Spectroscopy
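Abstract (translated from the Chinese)
Aluminum alloy is an important metal material used across many fields, yet large volumes of aluminum alloy scrap remain difficult to sort and recycle. Recycling secondary resources is a driver of green and sustainable industrial development in China, so fast and simple identification and classification of aluminum alloy scrap is a prerequisite for its recovery and reuse. Laser-induced breakdown spectroscopy (LIBS) is an analytical technique that has developed rapidly in recent years. It offers fast, real-time, in-situ, all-element analysis and remote detection, and has been widely applied to identification studies of plastics, soil, meat, steel, and other materials. Most of these studies build their models with partial least squares discriminant analysis (PLS-DA), soft independent modeling of class analogy (SIMCA), artificial neural networks (ANN), support vector machines (SVM), random forests, and similar algorithms. XGBoost, an algorithm based on iteratively built trees, offers regularization, parallel processing, built-in cross-validation, and a high degree of algorithmic flexibility. Its model structure is relatively simple, its computational cost is low, and its accuracy is high, which has made it one of the most popular machine learning algorithms in recent years and led to its wide use. Using 600 spectra from six aluminum alloy samples, spectral features were extracted with reference to the NIST atomic emission spectra database to determine the characteristic lines used as the basis for classification. XGBoost was then used for automatic classification and ranking: the processed spectral data were randomly divided into a training set and a test set, the model was built on the training set and its classification features were extracted, and the test set was used to check the stability and usability of the model and to guard against overfitting. The model obtained by XGBoost with fixed parameters shows a degree of adaptability and is little affected by the data set, with an overall accuracy of up to 96.67%. Its classification features agree with the known elemental content information, demonstrating that characteristic spectral line data can serve as a reference for classification and identification; the feature scores generated by XGBoost can also be used to rank the importance of the spectral line features. The experimental results show that LIBS can be used for rapid identification of different types of aluminum alloys, providing a new technique for the sorting and recycling of scrap metals.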
Abstract
Aluminum alloy is an important metal material that is widely used in many fields, but large amounts of aluminum alloy waste are difficult to sort and recycle. Recycling aluminum alloy resources is a booster for China’s green and sustainable industrial development, so quick and easy identification and classification of aluminum alloy waste has become a prerequisite for its reuse. Laser-induced breakdown spectroscopy (LIBS) is an analytical technique that has developed rapidly in recent years. It offers fast, real-time, in-situ, full-element analysis and long-distance detection, and it has been widely used in identification studies of plastics, soil, meat, steel, and other materials, most of which build models with PLS-DA, SIMCA, ANN, SVM, Random Forest, and other algorithms. The XGBoost algorithm offers regularization, parallel processing, built-in cross-validation, and high algorithmic flexibility. Its model structure is relatively simple, its computational cost is small, and its accuracy is high, which has made it extremely popular in machine learning in recent years. Based on 600 sets of spectral data from six aluminum alloy samples, the model extracts spectral features through machine learning to determine the classification. The processed spectral data are randomly divided into a training set and a test set, and the decision-tree-based XGBoost algorithm is used for automatic classification and sorting. A model is constructed from the training set and its classification features are extracted; the test set is used to check the stability and usability of the model and to prevent over-fitting. The model obtained by XGBoost under fixed parameters has a certain self-adaptability and is less affected by the data set, and the overall accuracy can reach 96.67%. Its classification characteristics are consistent with the known element content information, which proves that the characteristic spectral line data based on big data can provide a reference for classification and identification; the importance of spectral line features can also be ranked according to the feature score generated by XGBoost. The experimental results show that LIBS can be used for rapid identification of different types of aluminum alloys and provides a new technology for the classification and recovery of waste metals.
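The abstract notes that spectral line features can be ranked by an XGBoost feature score, but it does not say which score type was used. As a rough illustration only, the sketch below ranks two features by the regularized split gain that XGBoost computes from summed gradients and Hessians, evaluated at the first boosting round of a two-class logistic problem. This is not the authors' code: the feature names (line_A, line_B), the intensity values, the two-class setup, and the choice of gain as the score are all assumptions made for the example.

```python
# Toy illustration only: the intensities and labels below are made up,
# not taken from the paper. Feature names are hypothetical.

def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """XGBoost-style gain of one split from summed gradients and Hessians.

    lam is the L2 penalty on leaf weights; gamma is the per-split penalty.
    """
    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def best_gain_for_feature(values, labels, lam=1.0):
    """Best single-threshold gain for one feature at the first boosting round.

    With logistic loss and an initial raw score of 0, every sample has a
    predicted probability of 0.5, so gradient = 0.5 - y and Hessian = 0.25.
    """
    pairs = sorted(zip(values, labels))
    grads = [0.5 - y for _, y in pairs]
    hess = [0.25] * len(pairs)
    g_total, h_total = sum(grads), sum(hess)

    best = 0.0
    g_left = h_left = 0.0
    for i in range(len(pairs) - 1):
        g_left += grads[i]
        h_left += hess[i]
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # no threshold fits between identical values
        gain = split_gain(g_left, h_left, g_total - g_left, h_total - h_left, lam)
        best = max(best, gain)
    return best


def rank_features(features, labels, lam=1.0):
    """Return (name, best gain) pairs sorted from most to least informative."""
    gains = {name: best_gain_for_feature(vals, labels, lam)
             for name, vals in features.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical normalized line intensities for 8 spectra of two alloy classes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "line_A": [0.10, 0.12, 0.11, 0.13, 0.40, 0.42, 0.39, 0.41],  # separates the classes
    "line_B": [0.30, 0.52, 0.31, 0.50, 0.29, 0.51, 0.32, 0.49],  # uninformative
}

ranking = rank_features(features, labels)
for name, gain in ranking:
    print(f"{name}: gain = {gain:.3f}")
```

In this toy case the line whose intensity separates the two classes receives a much higher gain than the uninformative one, which is the kind of ordering a gain-based importance ranking produces. The XGBoost library also offers other importance types (for example, split counts), so a ranking reported in practice may be computed differently.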