Review of Deep Learning Based Object Detection Methods and Their Mainstream Frameworks

Abstract

Object detection is one of the core tasks in machine vision and a branch of artificial intelligence with significant research value. This review surveys three mainstream families of object detection models: convolutional neural network frameworks, anchor-based models, and anchor-free models. First, the network structures, strengths and weaknesses, and related improvements of mainstream convolutional neural network frameworks are reviewed. Second, anchor-based models are analyzed in depth along their one-stage and two-stage branches, and research progress on the different detection methods is summarized. Third, anchor-free models are analyzed in three parts: early exploration, keypoints, and dense prediction. Finally, future development trends in the field are discussed.

Supplementary information

CLC number: TP391

DOI:10.3788/LOP57.120005

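Column: Review

Funding: National Natural Science Foundation of China; Ministry of Industry and Information Technology funded project; Guizhou Provincial Science and Technology Program; grant 黔教合协同创新字[2015]002; Guizhou Provincial Graduate Innovation Fund

Received: 2019-11-11

Revised: 2019-12-06

Published online: 2020-06-01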

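Author affiliations

Duan Zhongjing: Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, Guizhou 550025
Li Shaobo: Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, Guizhou 550025; School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025
Hu Jianjun: School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025
Yang Jing: School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025
Wang Zheng: School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025

Corresponding author: Li Shaobo (lishaobo@gzu.edu.cn)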

Cite this paper

Duan Zhongjing, Li Shaobo, Hu Jianjun, Yang Jing, Wang Zheng. Review of Deep Learning Based Object Detection Methods and Their Mainstream Frameworks[J]. Laser & Optoelectronics Progress, 2020, 57(12): 120005