Spectroscopy and Spectral Analysis, 2021, 41(4): 1086, Published online: 2021-04-12

Parallel Extraction and Analysis of Abnormal Features of QSO Spectra Based on Sparse Subspace
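Author affiliation
School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China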
Abstract
Quasi-stellar objects (QSOs, or quasars) are the most distant celestial objects observed so far and are of great scientific value for understanding the evolution of the early universe. Because quasars lie far from the Earth, their redshifts are generally large, so only a few features (emission lines) fall within the optical observation window, and these are hard to identify. Constructing a QSO template is therefore difficult, and the automatic identification of QSOs has become an urgent problem. Extracting and analyzing the abnormal features of QSO spectra helps address these problems and provides an effective basis for identifying unknown quasars. Outlier detection, a major research topic in data mining, aims to find rare or special data objects and anomalous features in massive data sets, and so offers an effective way to discover special and unknown quasars in massive QSO spectral data. Spark, a new-generation distributed framework for big data processing, provides an efficient, easy-to-use and reliable parallel programming platform for analyzing and processing massive celestial spectra. Making full use of the data processing capacity of cluster systems and the Spark programming model, this paper proposes a sparse-subspace-based method for the parallel extraction and analysis of abnormal QSO spectral features. The work consists of three modules: QSO spectral feature reduction; construction and search of sparse subspaces of QSO spectral data; and the design and analysis of a parallel algorithm for extracting abnormal QSO spectral features. The feature reduction module uses attribute correlation analysis to identify spectral feature lines that exhibit a clustering structure; such lines are usually concentrated in dense regions and contribute nothing to the detection of anomalous spectral features. The module prunes these redundant feature lines before the anomaly detection algorithm runs, which narrows the search range and speeds up detection. The sparse-subspace construction and search module measures the subspace density of QSO spectra against a preset sparse coefficient threshold and uses particle swarm optimization as the search strategy for sparse subspaces, so that the anomalous features of quasars are obtained quickly and efficiently. In the third module, a parallel detection algorithm for abnormal QSO spectral data under the MapReduce framework is proposed. The algorithm consists of two MapReduce jobs, a parallel data reduction strategy and a parallel sparse-subspace search technique, so that it can handle massive spectral data. Finally, some of the detected QSO anomalous features are analyzed theoretically, measured and verified by visual inspection, which demonstrates that sparse subspaces can provide effective support and strong evidence for identifying candidate special and unknown QSOs.
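
The abstract does not state how the sparse coefficient is computed. As a rough illustration only, the sketch below assumes the sparsity coefficient commonly used in sparse-subspace outlier detection (Aggarwal and Yu): each feature is split into phi equi-depth ranges, and a k-dimensional cell holding n(D) of N objects is scored as S(D) = (n(D) - N f^k) / sqrt(N f^k (1 - f^k)), with f = 1/phi. Strongly negative values mark sparse cells. The function names, the toy data and the choice of phi are hypothetical, and the sketch runs on a single machine; it does not reproduce the paper's particle swarm search or its Spark/MapReduce implementation.

```python
import numpy as np


def equi_depth_bins(X, phi):
    """Map every value to one of phi equi-depth ranges, column by column."""
    n = X.shape[0]
    # Double argsort yields the rank (0..n-1) of each value within its column.
    ranks = X.argsort(axis=0).argsort(axis=0)
    return (ranks * phi) // n  # integer range ids in 0..phi-1


def sparsity_coefficient(bins, dims, cell, phi):
    """Sparsity coefficient of one cell of the subspace spanned by dims.

    Assumed definition (not confirmed by the abstract):
    S = (count - N * f**k) / sqrt(N * f**k * (1 - f**k)), f = 1 / phi.
    """
    n = bins.shape[0]
    k = len(dims)
    # Objects whose range ids match the cell on every chosen dimension.
    in_cell = np.all(bins[:, dims] == np.asarray(cell), axis=1)
    count = int(in_cell.sum())
    p = (1.0 / phi) ** k  # expected fraction of objects under independence
    return (count - n * p) / np.sqrt(n * p * (1.0 - p))


# Toy data: two perfectly correlated "feature lines" for 100 spectra.
X = np.column_stack([np.arange(100.0), np.arange(100.0)])
phi = 4
bins = equi_depth_bins(X, phi)

dense = sparsity_coefficient(bins, [0, 1], (0, 0), phi)   # crowded cell
sparse = sparsity_coefficient(bins, [0, 1], (0, 3), phi)  # empty cell
print("dense cell:", dense, "sparse cell:", sparse)
```

In a search such as the one the abstract describes, a cell would be kept as a candidate sparse subspace when its coefficient falls below the preset threshold; the threshold value itself is not given in the abstract.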

MA Yang, ZHANG Ji-fu, CAI Jiang-hui, YANG Hai-feng, ZHAO Xu-jun. Parallel Extraction and Analysis of Abnormal Features of QSO Spectra Based on Sparse Subspace[J]. Spectroscopy and Spectral Analysis, 2021, 41(4): 1086.
