A Fast Classification Method for Stellar Spectra Based on a Convolutional Neural Network

WANG Nan-nan; QIU Bo; MA Jie; SHI Chao-jun; SONG Tao; GUO Ping

doi:10.3964/j.issn.1000-0593(2019)10-3297-05

Spectroscopy and Spectral Analysis, 2019, 39 (10): 3297, published online: 2019-11-05

Fast Classification Method of Star Spectra Data Based on Convolutional Neural Network

WANG Nan-nan ¹,*, QIU Bo ¹, MA Jie ¹, SHI Chao-jun ¹, SONG Tao ¹, GUO Ping ²

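Author affiliations

¹ School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300401

² School of Systems Science, Beijing Normal University, Beijing 100875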

Keywords: stellar spectral data; automatic classification; CNN; 5-fold cross-validation

Abstract (translated from the Chinese)

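Classifying stellar spectra is one of the most basic tasks in the automatic recognition of celestial spectra, and research on spectral classification provides clues to stellar evolution. As technology advances, astronomy is entering the era of big data: the number of stellar spectra to be processed keeps growing, and classifying them automatically and accurately has become one of the difficult problems astronomers must solve. Relatively few methods currently address automatic stellar spectral classification, so this paper uses a method based on a convolutional neural network (CNN) to classify stellar spectra under the MK system. The network consists of a data input layer, four convolutional layers, four pooling layers, a fully connected layer, and an output layer; compared with traditional networks it offers advantages such as local perception and parameter sharing. The experiments were programmed in a Python 3.5 environment, using Tensorflow to build a simple and efficient CNN with four convolutional layers, with Dropout applied after the fully connected layer to prevent overfitting. The basic idea of Dropout is that, while the network is being trained, a certain proportion of its nodes are discarded so that they temporarily play no role. Dropout can be understood as a highly efficient form of neural network model averaging; because the network no longer depends on particular local features, the model becomes more robust. The one-dimensional stellar spectra used in the experiments were taken from the LAMOST DR3 database. In preprocessing, the 3 600 to 7 300 Å portion of each spectrum was extracted, uniformly sampled, and then normalized with the min-max method. The experiments comprise two parts. The first classifies spectra according to the MK system: each class has 1 000 training spectra and 400 test spectra; the CNN is first trained on the training samples for 3 000 iterations, and the trained network then classifies the test samples to verify its accuracy. The second classifies stellar spectra of two adjacent classes: the O-type data set contains 250 spectra and the data set for every other class contains 4 000 spectra. The data are divided into five equal parts; each time one part serves as the test set and the remaining parts as the training set, model accuracy is obtained by 5-fold cross-validation, and a BP neural network is used for a comparison experiment. The metrics chosen to evaluate the network model are precision P, recall R, F-score, and accuracy A. The results show that the CNN reaches an accuracy above 95% for each of the six stellar spectral classes. For adjacent classes, the results involving O-type stars are less satisfactory because the O-type sample is small, while the classification accuracy for the remaining classes is above 98%. These results demonstrate that the CNN algorithm handles the stellar spectral classification problem well.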
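The preprocessing described in the abstract (extract the 3 600 to 7 300 range, sample uniformly, apply min-max scaling) could look roughly like the pure-Python sketch below. The function names, the use of linear interpolation for the uniform sampling, and the number of sample points are assumptions made for illustration; the abstract does not state how the authors resampled or how many points they kept.

```python
from bisect import bisect_right


def crop_and_resample(wavelengths, fluxes, lo=3600.0, hi=7300.0, n_points=5):
    """Interpolate a spectrum onto n_points evenly spaced wavelengths in [lo, hi].

    Assumes `wavelengths` is sorted ascending and covers [lo, hi].
    Linear interpolation is an assumption, not the paper's stated method.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    if wavelengths[0] > lo or wavelengths[-1] < hi:
        raise ValueError("spectrum does not cover the requested range")
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    resampled = []
    for w in grid:
        # Index of the right-hand neighbour, clamped so that j - 1 and j are valid.
        j = min(max(bisect_right(wavelengths, w), 1), len(wavelengths) - 1)
        w0, w1 = wavelengths[j - 1], wavelengths[j]
        f0, f1 = fluxes[j - 1], fluxes[j]
        t = (w - w0) / (w1 - w0)
        resampled.append(f0 + t * (f1 - f0))
    return grid, resampled


def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat spectrum has no dynamic range; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Calling `crop_and_resample` and then `min_max_normalize` on each spectrum yields fixed-length vectors in [0, 1], which is the form a one-dimensional CNN input layer typically expects.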

Abstract

Classification of stellar spectral data is one of the most basic tasks in the automatic recognition of celestial spectra, and the study of spectral classification can provide clues to the evolution of stars. With the development of science and technology, astronomical data are moving into the era of big data. The number of stellar spectra that need to be processed is increasing, and how to classify them automatically and accurately has become one of the difficult problems that astronomers have to solve. At present, there are relatively few methods for automatic stellar spectral classification. In this paper, a method based on a convolutional neural network (CNN) is used to classify stellar spectra under the MK system. The network is composed of a data input layer, four convolutional layers, four pooling layers, a fully connected layer and an output layer. Compared with traditional networks, it has the advantages of local perception and parameter sharing. A simple and efficient CNN with four convolutional layers is constructed with Tensorflow in a Python 3.5 environment, and Dropout is applied after the fully connected layer to prevent overfitting. The basic idea of Dropout is that, when the network model is trained, some nodes are discarded in a certain proportion so that they temporarily play no role. Dropout can be understood as a very efficient neural network model averaging method; because the model does not depend on particular local features, it becomes more robust. The one-dimensional stellar spectra used in the experiment were taken from the LAMOST DR3 database. In preprocessing, the 3 600 to 7 300 Å part of each spectrum was extracted; after uniform sampling, it was normalized by the min-max method. The experiment consists of two parts. The first part classifies the spectra according to the MK system. For each class, the training set contains 1 000 spectra and the test set contains 400 spectra. The CNN is first trained on the training samples for 3 000 iterations, and the trained network then classifies the test samples to verify its accuracy. The second part is the classification of two adjacent types of stellar spectra, in which the O-type data set contains 250 spectra and the data sets of the other types each contain 4 000 spectra. The data are divided into five equal parts, one of which is selected as the test set each time with the rest as the training set, and the accuracy of the model is obtained by 5-fold cross-validation. A BP neural network is used for comparative experiments. The indicators used to evaluate the network model are precision P, recall R, F-score and accuracy A. The experimental results show that the CNN's classification accuracy for each of the six types of stellar spectra is above 95%. When classifying adjacent types of stars, the results involving O-type stars are not ideal because of the small O-type sample size, while the classification accuracy for the other types is higher than 98%. These results show that the CNN algorithm solves the stellar spectral classification problem well.
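The evaluation protocol in the abstract (split the data into five parts, test on each part in turn, and report P, R, F-score and A) can be sketched in plain Python as follows. The function names are hypothetical, the F-score is assumed to be the usual F1 (the harmonic mean of P and R) because the abstract does not define it, and the split shown is a simple shuffled one rather than whatever partitioning the authors used.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs so that every sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the folds reproducible
    folds = [indices[i::k] for i in range(k)]  # k parts of near-equal size
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def binary_metrics(y_true, y_pred, positive=1):
    """Return precision P, recall R, F-score (assumed F1) and accuracy A."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"P": precision, "R": recall, "F": f_score, "A": accuracy}
```

In use, a model would be trained on `train_idx` and scored with `binary_metrics` on `test_idx` for each of the five splits, and the five results averaged. With 250 O-type spectra against 4 000 of a neighbouring class, a plain shuffled split can leave folds with uneven class proportions; a stratified split is one common remedy, though the abstract does not say whether one was used.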

WANG Nan-nan, QIU Bo, MA Jie, SHI Chao-jun, SONG Tao, GUO Ping. Fast Classification Method of Star Spectra Data Based on Convolutional Neural Network[J]. Spectroscopy and Spectral Analysis, 2019, 39(10): 3297.
