Laser & Optoelectronics Progress, Vol. 56, No. 23, Article 231007

Object Detection Model Based on Multi-Scale Feature Integration

Abstract

To further improve the detection accuracy of the YOLOv2 algorithm while preserving its detection speed, a new model, RF-YOLOv2, is proposed on the basis of the YOLOv2 model. First, the KITTI data set is clustered to select the number and sizes of candidate boxes best suited to that data set. Second, a residual block structure is used to add convolutional layers to the training part of the network, so that the model extracts features that describe the objects more accurately. Finally, a feature pyramid network is introduced in the detection part of the network to fuse feature maps of different sizes, so that low-level feature maps also carry rich semantic information. Experimental results show that RF-YOLOv2 obtains deeper features and fuses object information at more scales. This alleviates the low detection rates caused by complex real road scenes and by objects that vary in shape and structure, and it improves detection accuracy while keeping the algorithm real-time. RF-YOLOv2 performs best on large objects.
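The candidate-box selection step can be illustrated with a small sketch. The abstract does not name the clustering algorithm or its distance measure; the code below assumes the k-means procedure with a 1 - IoU distance that the original YOLOv2 work uses for its anchor boxes, and the function names (`iou_wh`, `kmeans_anchors`, `mean_best_iou`) and the synthetic boxes are hypothetical, chosen only for illustration. It is not the authors' implementation.

```python
import numpy as np


def iou_wh(boxes, centroids):
    """IoU between (width, height) pairs, with all boxes aligned at one corner.

    boxes: array of shape (n, 2); centroids: array of shape (k, 2).
    Returns an (n, k) matrix of IoU values.
    """
    inter_w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = inter_w * inter_h
    area_b = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_c = (centroids[:, 0] * centroids[:, 1])[None, :]
    return inter / (area_b + area_c - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster box sizes with k-means, using 1 - IoU as the distance (assumed)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct boxes picked at random.
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Smallest (1 - IoU) distance is the same as largest IoU.
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        new = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids


def mean_best_iou(boxes, centroids):
    """Average IoU between each box and its closest centroid."""
    return float(iou_wh(boxes, centroids).max(axis=1).mean())


if __name__ == "__main__":
    # Synthetic (width, height) pairs stand in for labelled KITTI boxes.
    rng = np.random.default_rng(1)
    boxes = rng.uniform(10, 200, size=(200, 2))
    for k in range(1, 6):
        anchors = kmeans_anchors(boxes, k)
        print(k, round(mean_best_iou(boxes, anchors), 3))
```

Under these assumptions, one way to choose the number of candidate boxes is to run the loop over several values of k and stop where the mean best IoU no longer rises enough to justify the extra boxes.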

DOI:10.3788/LOP56.231007

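Section: Image Processing

Funding: Young Scientists Fund of the National Natural Science Foundation of China; Natural Science Foundation of Liaoning Province; Sixth Batch of the Innovative Research Fund for Production Technology Problems

Received: 2019-05-10

Revised: 2019-06-03

Published online: 2019-12-01

Author affiliations

Liu Wanjun: College of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
Wang Feng: College of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
Qu Haicheng: College of Software, Liaoning Technical University, Huludao, Liaoning 125105, China

Corresponding author: Wang Feng (838808390@qq.com)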

Cite this paper

Liu Wanjun, Wang Feng, Qu Haicheng. Object Detection Model Based on Multi-Scale Feature Integration[J]. Laser & Optoelectronics Progress, 2019, 56(23): 231007