Spectroscopy and Spectral Analysis, 2020, 40 (7): 2267, published online: 2020-12-05

Classification of Organic Contaminants in Water Distribution Systems Developed by SPA and Multi-Classification SVM Using UV-Vis Spectroscopy
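Author affiliation
State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, Zhejiang, China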
Abstract
Rapid and effective identification of contaminant classes in drinking water is important for reducing the impact of sudden drinking water pollution incidents. Most existing contaminant discrimination models based on UV-Vis spectroscopy use principal component analysis (PCA) for feature extraction. However, for organic contaminants with highly similar UV-Vis spectra, extracting only the directions of maximum variance, a purely data-driven choice, often gives poor identification results. To address the multicollinearity and overlapping-peak interference in the spectral data of organic contaminants, this paper studies a UV-Vis spectroscopy method for discriminating organic contaminants in drinking water based on the successive projections algorithm (SPA) and a multi-class support vector machine (M-SVM). First, the raw spectra of phenol, hydroquinone, resorcinol and m-phenylenediamine are measured with a UV spectrometer and preprocessed. A comparison of the correlation between wavelength and concentration for the four contaminants shows that the peaks of phenol and resorcinol, and those of hydroquinone and m-phenylenediamine, overlap severely, which can easily interfere with classification. For feature extraction, SPA is introduced to select combinations of characteristic wavelengths from the UV-Vis spectra of the organic contaminants. Multiple linear regression is then performed on the absorbance for different numbers of wavelengths, and the parameters and wavelength combination that give the minimum prediction standard deviation are taken as the optimal combination. Based on the optimal characteristic wavelength combination, a multi-class SVM model is built to classify and identify organic contaminants in drinking water. Finally, the classification results obtained with the full spectrum, PCA features and SPA features are compared under different classification methods and different contaminant concentrations, which further illustrates the applicability and stability of SPA. The experimental results show that SPA, as a method that selects original characteristic wavelengths from the spectral data, can effectively extract features from the UV-Vis spectra of organic contaminants, amplify the differences among substances, and to a certain extent eliminate the interference of multicollinearity and overlapping peaks, thereby improving the accuracy of the classification model. The method offers a useful reference for identifying the types of contaminants with overlapping peaks in drinking water.
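The wavelength-selection step described in the abstract (SPA candidates scored by multiple linear regression, keeping the subset with the lowest prediction error) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the function names `spa_chain`, `rmse_mlr` and `spa_select`, the mean-centering step, the calibration/validation split and the synthetic single-band spectra are all assumptions made here for the example, and RMSE on a held-out set stands in for the "prediction standard deviation".

```python
import numpy as np


def spa_chain(X, start, n_select):
    """Pick n_select column indices by successive projections, beginning at `start`."""
    Xp = X - X.mean(axis=0)  # mean-center each wavelength (assumed preprocessing)
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        # Project every column onto the subspace orthogonal to the last pick.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never pick the same wavelength twice
        selected.append(int(np.argmax(norms)))
    return selected


def rmse_mlr(Xc, yc, Xv, yv, cols):
    """Fit a linear model with intercept on the calibration set, score on validation."""
    A = np.column_stack([np.ones(len(Xc)), Xc[:, cols]])
    coef, *_ = np.linalg.lstsq(A, yc, rcond=None)
    B = np.column_stack([np.ones(len(Xv)), Xv[:, cols]])
    return float(np.sqrt(np.mean((B @ coef - yv) ** 2)))


def spa_select(Xc, yc, Xv, yv, max_n):
    """Try every start wavelength and subset size up to max_n; keep the lowest RMSE."""
    best_rmse, best_cols = np.inf, []
    for start in range(Xc.shape[1]):
        chain = spa_chain(Xc, start, max_n)
        for m in range(1, max_n + 1):
            err = rmse_mlr(Xc, yc, Xv, yv, chain[:m])
            if err < best_rmse:
                best_rmse, best_cols = err, chain[:m]
    return sorted(best_cols), best_rmse


# Synthetic data for illustration only (NOT the paper's measurements):
# one Gaussian absorption band whose height scales with concentration.
rng = np.random.default_rng(0)
wl = np.linspace(200, 400, 60)                      # hypothetical wavelengths in nm
band = np.exp(-0.5 * ((wl - 270) / 15) ** 2)        # hypothetical band shape
conc = rng.uniform(0.1, 1.0, size=40)               # hypothetical concentrations
X = np.outer(conc, band) + 0.01 * rng.standard_normal((40, 60))

# First 30 spectra for calibration, last 10 for validation (arbitrary split).
cols, err = spa_select(X[:30], conc[:30], X[30:], conc[30:], max_n=5)
print("selected wavelengths (nm):", wl[cols])
print("validation RMSE:", err)
```

In a full pipeline the columns in `cols` would then be passed to a multi-class SVM, for example scikit-learn's `SVC`; that tooling choice is an assumption here, since the abstract does not name a software package.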

HUANG Ping-jie, LI Yu-han, YU Qiao-jun, WANG Ke, YIN Hang, HOU Di-bo, ZHANG Guang-xin. Classification of Organic Contaminants in Water Distribution Systems Developed by SPA and Multi-Classification SVM Using UV-Vis Spectroscopy[J]. Spectroscopy and Spectral Analysis, 2020, 40(7): 2267.

