Spectroscopy and Spectral Analysis, 2022, 42(7): 2113, published online: 2022-11-16

The “Cluster-Regression” COD Prediction Model of Distributed Rural Sewage Based on Three-Dimensional Fluorescence Spectrum and Ultraviolet-Visible Absorption Spectrum
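Author affiliations
1 China-UK Low Carbon College, Shanghai Jiao Tong University, Shanghai 201306
2 School of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240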
Abstract
Based on the relationship between three-dimensional fluorescence spectra and the characteristic fluorescence peaks of organic matter, this study proposes a technical route in which water samples are first clustered using their three-dimensional fluorescence spectra, and a COD prediction model is then built for each class of samples from UV-Vis full-band absorption spectrum data. Two spectral analysis methods, the parallel factor analysis (PARAFAC) algorithm and the fluorescence volume integration (FRI) algorithm, were compared; the fuzzy c-means (FCM) algorithm was then used for clustering, and COD prediction models were established for the different classes of water samples. The water samples were collected in rural areas around Changshu City, Jiangsu Province; all 100 experimental samples came from the effluent of different distributed rural domestic sewage treatment units. The measured three-dimensional fluorescence spectra were pretreated to remove scattering, and fluorescence feature data were then extracted with the PARAFAC and FRI algorithms, respectively. Next, the FCM algorithm was used for similarity clustering. Finally, the partial least squares (PLS) algorithm was used to build regression and prediction models between the UV-Vis full-band absorption spectrum and COD of the water samples, and prediction accuracy was evaluated with the coefficient of determination (R2) and the root mean square error (RMSE). The results showed that the mean R2 of the models built without classification, with FRI feature extraction, and with PARAFAC feature extraction was 0.632, 0.819 and 0.906, respectively; the corresponding mean RMSE was 27.857, 23.621 and 13.071. Regression and prediction accuracy improved significantly after clustering, and the model built after extracting fluorescence feature information with the PARAFAC algorithm had the highest prediction accuracy, with an R2 0.274 higher than that of the unclassified prediction model. The proposed COD prediction model, which combines three-dimensional fluorescence spectra with UV-Vis full-band absorption spectra and uses the combined “PARAFAC-FCM-PLS” algorithm, can effectively improve the prediction accuracy of COD and provides a new approach for high-precision online monitoring of water quality.
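The clustering and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are made-up stand-ins for PARAFAC component scores, the farthest-first initialization and all function names are assumptions chosen for illustration, and the PARAFAC and PLS steps are omitted entirely. It shows only the standard fuzzy c-means updates and the usual definitions of R2 and RMSE.

```python
import math

def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def _farthest_first(points, n_clusters):
    """Deterministic start (an assumption of this sketch): take the first point,
    then repeatedly add the point farthest from the centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < n_clusters:
        nxt = max(points, key=lambda p: min(_dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers

def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=200, tol=1e-8):
    """Standard FCM with fuzzifier m; returns (centers, memberships)."""
    centers = _farthest_first(points, n_clusters)
    dim = len(points[0])
    for _ in range(max_iter):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)), rows sum to 1.
        memberships = []
        for p in points:
            inv = [max(_dist(p, c), 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for k in range(n_clusters):
            w = [u[k] ** m for u in memberships]
            w_sum = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / w_sum for d in range(dim)]
            )
        shift = max(_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers no longer move
            break
    return centers, memberships

def harden(memberships):
    """Assign each sample to the cluster with its largest membership."""
    return [max(range(len(u)), key=u.__getitem__) for u in memberships]

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical two-component fluorescence scores for six samples (illustrative only).
features = [(0.10, 0.90), (0.20, 0.80), (0.15, 0.85),
            (0.90, 0.10), (0.80, 0.20), (0.85, 0.15)]
centers, memberships = fuzzy_c_means(features, n_clusters=2)
labels = harden(memberships)
```

In a full pipeline of the kind the abstract describes, the hard labels would be used to split the samples, one regression model would be fitted per class on the absorption spectra, and `r2` and `rmse` would be computed per class and then averaged.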

Ming-rui ZHOU, Jiang-bei QU, Peng LI, Yi-liang HE. The “Cluster-Regression” COD Prediction Model of Distributed Rural Sewage Based on Three-Dimensional Fluorescence Spectrum and Ultraviolet-Visible Absorption Spectrum[J]. Spectroscopy and Spectral Analysis, 2022, 42(7): 2113.
