Research Progresses of Target Detection Technology Based on Deep Learning

LUO Yuan; WANG Boyu; CHEN Xu

doi:10.16818/j.issn1001-5868.2020.01.001

Semiconductor Optoelectronics, 2020, 41(1): 1. Published online: 2020-04-13

Author Affiliation

School of Optoelectronic Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065

Cite This Paper

LUO Yuan, WANG Boyu, CHEN Xu. Research Progresses of Target Detection Technology Based on Deep Learning[J]. Semiconductor Optoelectronics, 2020, 41(1): 1.
