Real-Time Detection Based on Improved Single Shot MultiBox Detector

Abstract

Convolutional neural networks are widely used in object detection. However, methods based on convolutional neural networks require a large amount of computation, which makes them difficult to run on platforms with limited computing power. To address this, a fast detection method based on the Single Shot MultiBox Detector (SSD), named Faster-SSD, is proposed. The method achieves real-time detection on a compute-limited platform while maintaining high accuracy. The base network of SSD is replaced with ResNet-34. In the stage of generating prediction boxes, the prior boxes that satisfy the condition are obtained first, and the prediction boxes of the corresponding categories are then generated. A variable minimum threshold is proposed to reduce the amount of computation. Finally, online hard example mining is applied to remove easy samples. Experimental results show that Faster-SSD reaches 14 frame/s on an NVIDIA Jetson TX2.
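The abstract describes selecting prior boxes first and generating prediction boxes only for those that qualify. The sketch below illustrates one plausible reading of that step, not the authors' implementation. It assumes the "condition" is a confidence threshold on the best non-background class score, and it decodes boxes with the center/size variances (0.1, 0.2) commonly used in SSD implementations. The function names `decode_box` and `filter_then_decode` are hypothetical, and the threshold is a plain parameter because the abstract does not say how the variable minimum threshold is computed.

```python
import math


def decode_box(prior, offset, variances=(0.1, 0.2)):
    """Decode one SSD-style box.

    prior  : (cx, cy, w, h) of the prior box, normalized to [0, 1]
    offset : (dx, dy, dw, dh) predicted by the localization head
    Returns corner coordinates (xmin, ymin, xmax, ymax).
    """
    pcx, pcy, pw, ph = prior
    dx, dy, dw, dh = offset
    cx = pcx + dx * variances[0] * pw
    cy = pcy + dy * variances[0] * ph
    w = pw * math.exp(dw * variances[1])
    h = ph * math.exp(dh * variances[1])
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def filter_then_decode(priors, offsets, scores, threshold):
    """Decode only the priors whose best foreground score passes the threshold.

    scores[i][c] is the confidence of class c for prior i; class 0 is
    assumed to be background. Priors that fail the test are skipped
    before any box arithmetic is done, which is where the saving comes from.
    Returns a list of (class_id, score, box) tuples.
    """
    detections = []
    for prior, offset, cls_scores in zip(priors, offsets, scores):
        # Best non-background class for this prior.
        best_cls = max(range(1, len(cls_scores)), key=lambda c: cls_scores[c])
        best_score = cls_scores[best_cls]
        if best_score < threshold:
            continue  # no decoding for low-confidence priors
        detections.append((best_cls, best_score, decode_box(prior, offset)))
    return detections


if __name__ == "__main__":
    priors = [(0.5, 0.5, 0.2, 0.2), (0.3, 0.3, 0.1, 0.1)]
    offsets = [(0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
    scores = [[0.1, 0.8, 0.1], [0.9, 0.05, 0.05]]
    for det in filter_then_decode(priors, offsets, scores, threshold=0.5):
        print(det)
```

With a typical SSD head producing thousands of prior boxes per image, most of which are background, filtering on the class scores before decoding avoids the exponentials and corner conversions for the discarded priors. Raising the threshold trades recall for less work in decoding and in the subsequent non-maximum suppression.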

CLC number: O436

DOI:10.3788/lop56.011002

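Column: Image Processing

Funding: Ministry of Education-China Mobile Research Fund (MCM20170204); Qinghai Province Thousand Talents Program for High-End Innovative Talents (2016280)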

Received: 2018-06-04

Revised: 2018-06-26

Published online: 2018-07-18

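Author affiliations

Chen Lili: Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, Jiangnan University, Wuxi, Jiangsu 214122
Zhang Zhengdao: Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, Jiangnan University, Wuxi, Jiangsu 214122
Peng Li: Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, Jiangnan University, Wuxi, Jiangsu 214122; Jiangsu Key Laboratory of Internet of Things Application Technology, Wuxi Taihu University, Wuxi, Jiangsu 214122

Corresponding author: Zhang Zhengdao (wxzzd@hotmail.com)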

Cite this paper

Chen Lili, Zhang Zhengdao, Peng Li. Real-Time Detection Based on Improved Single Shot MultiBox Detector[J]. Laser & Optoelectronics Progress, 2019, 56(1): 011002.
