
Fast Object Detection and Recognition Algorithm Based on Improved Multi-Scale Feature Maps


Abstract

To meet the accuracy and real-time requirements of object detection and recognition, a fast detection and recognition algorithm based on improved multi-scale feature maps is proposed. Building on the original SSD model, the algorithm uses a convolutional neural network to extract multi-scale feature maps automatically and constructs an effective module for fusing convolutional feature maps. A lightweight compact bilinear fusion method is introduced to enrich the context information. A channel attention mechanism then adaptively learns the relationships among the channels of each feature map, emphasizing useful information and suppressing redundant information, which improves the discriminability of the feature maps. The enhanced multi-scale feature maps are used in the detection model. Experimental results show that, compared with similar algorithms, the proposed algorithm is more efficient and clearly improves recognition precision while running at 63 frame·s⁻¹, achieving a good balance between recognition precision and speed.
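The channel attention step summarized in the abstract can be illustrated with a small sketch. The block below is a squeeze-and-excitation style gate written in NumPy. It is an assumption that the paper's mechanism takes this form. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the reduction ratio `r` are hypothetical and chosen only for illustration. The weights here are random rather than learned.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Reweight the channels of a (C, H, W) feature map with a learned gate.

    Illustrative squeeze-and-excitation style sketch; not the paper's code.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = feature_map.mean(axis=(1, 2))          # shape (C,)
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate in (0, 1).
    hidden = np.maximum(w1 @ squeezed, 0.0)           # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # shape (C,)
    # Scale: channels with larger gates are emphasized, others suppressed.
    return feature_map * gates[:, None, None], gates

# Toy input: 8 channels on a 4x4 grid, reduction ratio r = 2 (assumed values).
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1   # stand-in for learned weights
w2 = rng.standard_normal((C, C // r)) * 0.1   # stand-in for learned weights
y, gates = channel_attention(x, w1, w2)
```

Each channel of `y` is the matching channel of `x` multiplied by a scalar between 0 and 1, so the gate changes the relative weight of the channels without changing the spatial layout of the feature map.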


CLC number: TP391

DOI:10.3788/lop56.021002

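Column: Image Processing

Funding: National Natural Science Foundation of China (61871278); Chengdu Industrial Cluster Collaborative Innovation Project (2016-XT00-00015-GX); Chengdu Science and Technology Benefiting the People Technology R&D Project (2015-HM01-00293-SF); Dongguan Social Science and Technology Development Project (2017507102428); Sichuan University Graduate Course Construction Project (2016KCJS5113)

Received: 2018-05-11

Revised: 2018-06-28

Published online: 2018-07-30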

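Author affiliations

Shan Qianwen: College of Electronics and Information Engineering, Sichuan University, Chengdu, Sichuan 610065, China
Zheng Xinbo: Dongguan Frontier Technology Institute, Dongguan, Guangdong 523000, China
He Xiaohai: College of Electronics and Information Engineering, Sichuan University, Chengdu, Sichuan 610065, China
Teng Qizhi: College of Electronics and Information Engineering, Sichuan University, Chengdu, Sichuan 610065, China
Wu Xiaohong: College of Electronics and Information Engineering, Sichuan University, Chengdu, Sichuan 610065, China

Corresponding author: He Xiaohai (nic5602@scu.edu.cn)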


Cite this paper

Shan Qianwen, Zheng Xinbo, He Xiaohai, Teng Qizhi, Wu Xiaohong. Fast Object Detection and Recognition Algorithm Based on Improved Multi-Scale Feature Maps[J]. Laser & Optoelectronics Progress, 2019, 56(2): 021002.
