Optics and Precision Engineering, 2020, 28(3): 695. Published online: 2020-05-12

Multi-label classification of traditional national costume pattern image semantic understanding
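Author affiliations
1 School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
2 School of Digital Media and Design Arts, Beijing University of Posts and Telecommunications, Beijing 100876, China
3 Institute of Network Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
4 Century College, Beijing University of Posts and Telecommunications, Beijing 102101, China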
Abstract
Current image multi-label classification methods attend only to the category information of the objects in an image (the "ontology") and ignore the deeper semantic information the image carries (the "implicit meaning"). To address this, the paper proposes an image multi-label classification model based on "ontology-implicit" fusion learning. The model first uses the middle and higher layers of a CNN to learn the ontology information and the implicit information of an image, respectively, and then builds a fusion learning model on the dependency between the two. The features of different middle layers and different model structures are also studied in depth. The resulting model classifies both the multiple categories present in an image and the implicit information carried by each category. On a traditional national costume pattern image dataset, the mAP is 0.88 for ontology multi-label classification and 0.82 for implicit multi-label classification. In comparative experiments on the Scene dataset, the model outperforms the best competing method by 0.103, 0.091, and 0.083 on Hamming loss, One-error, and Average precision, respectively. These results demonstrate the effectiveness and superiority of the method.
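The three metrics reported for the Scene dataset can be illustrated with a minimal NumPy sketch. It follows the definitions commonly used in the multi-label literature and is not the authors' evaluation code. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. The class-wise mAP reported for the costume pattern dataset is a different quantity and is not covered here.

```python
import numpy as np

def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) slots predicted wrongly; lower is better."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    return float(np.mean(y_true != y_pred))

def one_error(y_true, scores):
    """Fraction of samples whose top-scored label is not a true label; lower is better."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    top = np.argmax(scores, axis=1)
    return float(np.mean(~y_true[np.arange(len(y_true)), top]))

def average_precision(y_true, scores):
    """Example-based average precision; higher is better.

    For each sample, labels are ranked by score, and the precision at the
    rank of every true label is averaged. Samples without true labels are skipped.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    per_sample = []
    for truth, s in zip(y_true, scores):
        if not truth.any():
            continue
        order = np.argsort(-s, kind="stable")      # highest score first
        ranks = np.flatnonzero(truth[order]) + 1   # 1-based ranks of the true labels
        precisions = np.arange(1, len(ranks) + 1) / ranks
        per_sample.append(precisions.mean())
    return float(np.mean(per_sample))

# Hypothetical toy data: 2 images, 3 labels.
y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])
scores = np.array([[0.9, 0.2, 0.4],
                   [0.6, 0.3, 0.1]])
y_pred = scores >= 0.5  # assumed decision threshold

print(hamming_loss(y_true, y_pred))       # 0.5
print(one_error(y_true, scores))          # 0.5
print(average_precision(y_true, scores))  # 0.75
```

Hamming loss and One-error are error rates, so a smaller value is better, while a larger Average precision is better. A reported margin over a competing method therefore means a decrease for the first two metrics and an increase for the third.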

ZHAO Hai-ying, ZHOU Wei, HOU Xiao-gang, QI Guang-lei. Multi-label classification of traditional national costume pattern image semantic understanding[J]. Optics and Precision Engineering, 2020, 28(3): 695.
