Laser & Optoelectronics Progress, 2019, 56(20): 201004, Published online: 2019-10-22
Image Annotation Based on Convolutional Neural Network and Topic Model
Keywords: image processing; convolutional neural network; topic model; image annotation; loss function; multi-label classification learning
Abstract
To reduce the sparsity of image text data and overcome the limitations of traditional image features, this study proposes an image annotation algorithm that combines a convolutional neural network (CNN) with a topic model. First, a Dirichlet topic model is fitted to the text data of the image training set, producing a text-topic distribution and a topic-to-annotation-word distribution; this reduces the dimensionality and sparsity of the image text data. Because the text topics of an image are sparsely distributed, the CNN is used to extract high-level visual features of the image, and the CNN is rebuilt with a modified loss function. Multiple classifiers are then constructed from the high-level visual features and the corresponding text topics to perform multi-label classification learning over the text topics and obtain the text-topic distribution of each image. Finally, this text-topic distribution is fused with the topic-to-annotation-word distribution generated by the topic model to compute the probability of each annotation word for the image. Comparative experiments on the Corel5K and IAPR TC-12 image annotation datasets show that the proposed method effectively improves annotation performance.
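The final fusion step can be illustrated with a small sketch. The abstract does not give the exact fusion formula, so the code below assumes the common mixture form P(w | image) = sum over topics k of P(w | k) * P(k | image), and it assumes the per-image topic scores have already been normalized to sum to 1. The function name `annotate`, the vocabulary, and all numeric values are hypothetical and chosen only for illustration.

```python
def annotate(topic_probs, topic_word_probs, vocab, top_n=5):
    """Rank annotation words by an assumed mixture:
    P(w | image) = sum_k P(w | k) * P(k | image).
    """
    n_words = len(vocab)
    scores = [0.0] * n_words
    # Accumulate each topic's contribution to every word score.
    for p_topic, word_dist in zip(topic_probs, topic_word_probs):
        for j in range(n_words):
            scores[j] += p_topic * word_dist[j]
    # Sort words by descending score and keep the top_n.
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Toy values, made up for illustration (not taken from the paper).
vocab = ["sky", "water", "tree", "tiger"]

# P(topic | image): stands in for the output of the CNN topic classifier.
topic_probs = [0.7, 0.3]

# P(word | topic): stands in for the topic model output; each row sums to 1.
topic_word_probs = [
    [0.5, 0.4, 0.1, 0.0],  # topic 0
    [0.0, 0.1, 0.4, 0.5],  # topic 1
]

top_words = annotate(topic_probs, topic_word_probs, vocab, top_n=2)
print(top_words)
```

With these toy numbers the two highest-ranked words are "sky" and "water", because topic 0 dominates the image's topic distribution and concentrates its mass on those words.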
Lei Zhang, Ming Cai. Image Annotation Based on Convolutional Neural Network and Topic Model[J]. Laser & Optoelectronics Progress, 2019, 56(20): 201004.