Survey of target detection based on deep convolutional neural networks
FAN Li-li, ZHAO Hong-wei, ZHAO Hao-yu, HU Huang-shui, WANG Zhen. Survey of target detection based on deep convolutional neural networks[J]. Optics and Precision Engineering, 2020, 28(5): 1152.
[1] KHAN A, RINNER B, CAVALLARO A. Cooperative robots to observe moving targets [J]. IEEE Transactions on Cybernetics, 2016, 48(1): 187-198.
[2] SAPUTERA Y P, WAHAB M, ESTU T T. Radar Software Development for the Surveillance of Indonesian Aerospace Sovereignty [C]. 2018 International Conference on Electrical Engineering and Computer Science (ICECOS), IEEE, 2018: 189-194.
[3] ANTON S D, SINHA S, SCHOTTEN H D. Anomaly-based Intrusion Detection in Industrial Data with SVM and Random Forests [C]. 2019 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), IEEE, 2019: 1-6.
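[4] WANG Y D, ZHU L Q, YU Z J, et al.. Dual-view high-speed vision detection system for instantaneous targets in mechanical systems [J]. Optics and Precision Engineering, 2017, 25(10): 2725-2735. (in Chinese)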
[5] JIANG A Q, HUYNH D. Multiple pedestrian tracking from monocular videos in an interacting multiple model framework [J]. IEEE Transactions on Image Processing, 2017, 27(3): 1361-1375.
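[6] ZHANG X R, HU B L, PAN Z B, et al.. Target detection in hyperspectral images based on tensor representation [J]. Optics and Precision Engineering, 2019, 27(2): 488-498. (in Chinese)
[7] LI Z Z, CAO L, SHAO W X, et al.. Detection of small dim targets on the sea surface based on spatio-temporal chaos analysis [J]. Optics and Precision Engineering, 2018, 26(1): 193-199. (in Chinese)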
[8] LOWE D. Distinctive image features from scale-invariant keypoints [J]. International Journal of Computer Vision, 2004, 60(2): 91-110.
[9] CAI Z W, SABERIAN M, VASCONCELOS N. Learning complexity-aware cascades for deep pedestrian detection [C]. Proceedings of the IEEE International Conference on Computer Vision, 2015: 3361-3369.
[10] VIOLA P, JONES M. Rapid object detection using a boosted cascade of simple features [J]. CVPR, 2001, 1(3): 511-518.
[11] ZHANG C X, ZHANG J S, KIM S W. PBoostGA: pseudo-boosting genetic algorithm for variable ranking and selection [J]. Computational Statistics, 2016, 31(4): 1237-1262.
[12] PEI L, YE M, ZHAO X Z, et al.. Learning spatio-temporal features for action recognition from the side of the video [J]. Signal, Image and Video Processing, 2016, 10(1): 199-206.
[13] LECUN Y, BOTTOU L, BENGIO Y, et al.. Gradient-based learning applied to document recognition [J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[14] LECUN Y, BOSER B, DENKER J, et al.. Handwritten digit recognition with a back-propagation network [J]. Advances in Neural Information Processing Systems, 1990: 396-404.
[15] HECHT-NIELSEN R. Theory of the backpropagation neural network [M]. Neural Networks for Perception, Elsevier, 1992: 65-93.
[16] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. Imagenet classification with deep convolutional neural networks [J]. Advances in neural information processing systems, 2012: 1097-1105.
[17] NAIR V, HINTON G E. Rectified linear units improve restricted boltzmann machines [C]. Proceedings of the 27th international conference on machine learning (ICML-10), 2010: 807-814.
[18] HINTON G E, SRIVASTAVA N, KRIZHEVSKY A, et al.. Improving neural networks by preventing co-adaptation of feature detectors [J]. Computer Science, 2012, 3(4): 212-223.
[19] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks [C]. European Conference on Computer Vision, Springer, 2014: 818-833.
[20] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [J]. Computer Science, 2014.
[21] SZEGEDY C, LIU W, JIA Y, et al.. Going deeper with convolutions [C]. Proceedings of the IEEE conference on computer vision and pattern recognition, 2015: 1-9.
[22] HE K, ZHANG X Y, REN S Q, et al.. Deep residual learning for image recognition [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.
[23] IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internal covariate shift [C]. International Conference on Machine Learning, JMLR, 2015.
[24] UIJLINGS J R, VAN DE SANDE K E, GEVERS T, et al.. Selective search for object recognition [J]. International Journal of Computer Vision, 2013, 104(2): 154-171.
[25] KUO W, HARIHARAN B, MALIK J. Deepbox: Learning objectness with convolutional networks [C]. Proceedings of the IEEE International Conference on Computer Vision, 2015: 2479-2487.
[26] PINHEIRO P O, LIN T-Y, COLLOBERT R, et al.. Learning to refine object segments [C]. European Conference on Computer Vision, Springer, 2016: 75-91.
[27] GUPTA S, GIRSHICK R, ARBELAEZ P, et al.. Learning rich features from RGB-D images for object detection and segmentation [C]. European Conference on Computer Vision, Springer, 2014: 345-360.
[28] PERRONNIN F, SANCHEZ J, MENSINK T. Improving the fisher kernel for large-scale image classification [C]. European Conference on Computer Vision, Springer, 2010: 143-156.
[29] HE K, ZHANG X Y, REN S Q, et al.. Spatial pyramid pooling in deep convolutional networks for visual recognition [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[30] GIRSHICK R. Fast R-CNN [C]. Proceedings of the IEEE International Conference on Computer Vision, 2015: 1440-1448.
[31] XUE J, LI J, GONG Y. Restructuring of deep neural network acoustic models with singular value decomposition [C]. Interspeech, 2013: 2365-2369.
[32] REN S, HE K, GIRSHICK R, et al.. Faster R-CNN: Towards real-time object detection with region proposal networks [C]. Advances in Neural Information Processing Systems, 2015: 91-99.
[33] DAI J, LI Y, HE K, et al.. R-FCN: Object detection via region-based fully convolutional networks [C]. Advances in Neural Information Processing Systems, 2016: 379-387.
[34] LIN T-Y, MAIRE M, BELONGIE S, et al.. Microsoft coco: Common objects in context [C]. European Conference on Computer Vision, Springer, 2014: 740-755.
[35] LIU W, ANGUELOV D, ERHAN D, et al.. SSD: Single shot multibox detector [C]. European conference on computer vision, Springer, 2016: 21-37.
[36] REDMON J, DIVVALA S, GIRSHICK R, et al.. You Only Look Once: Unified, real-time object detection [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
[37] REDMON J, FARHADI A. YOLO9000: better, faster, stronger [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 7263-7271.
[38] REDMON J, FARHADI A. YOLOV3: An incremental improvement [J]. arXiv preprint arXiv: 1804.02767, 2018.
[39] ERHAN D, SZEGEDY C, TOSHEV A, et al.. Scalable object detection using deep neural networks [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014: 2147-2154.
[40] BELL S, ZITNICK C L, BALA K, et al.. Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 2874-2883.
[41] FU C-Y, LIU W, RANGA A, et al.. DSSD: Deconvolutional single shot detector [J]. arXiv preprint arXiv: 1701.06659, 2017.
[42] SHEN Z, LIU Z, LI J, et al.. DSOD: Learning deeply supervised object detectors from scratch [C]. Proceedings of the IEEE International Conference on Computer Vision, 2017: 1919-1927.
[43] LAW H, DENG J. CornerNet: Detecting objects as paired keypoints [C]. Proceedings of the European Conference on Computer Vision (ECCV), 2018: 734-750.
[44] ZHU C, HE Y, SAVVIDES M. Feature selective anchor-free module for single-shot object detection [J]. arXiv preprint arXiv: 1903.00621, 2019.
[45] ZHOU X, ZHUO J, KRAHENBUHL P. Bottom-up object detection by grouping extreme and center points [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019: 850-859.
[46] TIAN Z, SHEN C, CHEN H, et al.. FCOS: Fully Convolutional One-Stage Object Detection [J]. arXiv preprint arXiv: 1904.01355, 2019.
[47] DUAN K, BAI S, XIE L, et al.. Centernet: Keypoint triplets for object detection [C]. Proceedings of the IEEE International Conference on Computer Vision, 2019: 6569-6578.
[48] EVERINGHAM M, VAN GOOL L, WILLIAMS C, et al.. The pascal visual object classes (voc) challenge [J]. International Journal of Computer Vision, 2010, 88(2): 303-338.
[49] KUZNETSOVA A, ROM H, ALLDRIN N, et al.. The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale [J]. arXiv preprint arXiv: 1811.00982, 2018.
[50] DENG J, DONG W, SOCHER R, et al.. Imagenet: A large-scale hierarchical image database [C]. 2009 IEEE conference on computer vision and pattern recognition, IEEE, 2009: 248-255.
[51] YANG S, LUO P, LOY C-C, et al.. Wider face: A face detection benchmark [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 5525-5533.
[52] JAIN V, LEARNED-MILLER E. FDDB: A benchmark for face detection in unconstrained settings [C]. Computer Science, 2010.
[53] FELZENSZWALB P, GIRSHICK R, MCALLESTER D, et al.. Discriminatively trained mixtures of deformable part models [J]. PASCAL VOC Challenge, 2008.
[54] DOLLAR P, WOJEK C, SCHIELE B, et al.. Pedestrian detection: An evaluation of the state of the art [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 34(4): 743-761.
[55] ZHANG S, BENENSON R, SCHIELE B. Citypersons: A diverse dataset for pedestrian detection [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 3213-3221.
[56] DALAL N, TRIGGS B. Histograms of oriented gradients for human detection [C]. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 2005, 1: 886-893.
[57] GEIGER A, LENZ P, STILLER C, et al.. Vision meets robotics: The KITTI dataset [J]. The International Journal of Robotics Research, 2013, 32(11): 1231-1237.
[57] ESS A, LEIBE B, VAN GOOL L. Depth and appearance for mobile scene analysis [C]. 2007 IEEE 11th International Conference on Computer Vision, IEEE, 2007: 1-8.
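[58] LIU X, CUI G Z, LI Z Z, et al.. Small target motion detection based on hierarchy of the visual system [J]. Optics and Precision Engineering, 2019, 27(10): 2251-2262. (in Chinese)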
[59] SHRIVASTAVA A, GUPTA A, GIRSHICK R. Training region-based object detectors with online hard example mining [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 761-769.
[60] KONG T, SUN F, YAO A, et al.. Ron: Reverse connection with objectness prior networks for object detection [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 5936-5944.
[61] XIANG Y, CHOI W, LIN Y, et al.. Subcategory-aware convolutional neural networks for object proposals and detection [C]. 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, 2017: 924-933.
[62] LIN T-Y, DOLLAR P, GIRSHICK R, et al.. Feature pyramid networks for object detection [C]. Proceedings of the IEEE conference on computer vision and pattern recognition, 2017: 2117-2125.
[63] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al.. Generative adversarial nets [C]. Advances in Neural Information Processing Systems, 2014: 2672-2680.
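[64] LIANG H, LIU K J, LIU K, et al.. Siamese neural network target tracking with a re-detection mechanism [J]. Optics and Precision Engineering, 2019, 27(7): 1621-1631. (in Chinese)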
[65] HUANG J, RATHOD V, SUN C, et al.. Speed/accuracy trade-offs for modern convolutional object detectors [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 7310-7311.
[66] TOME D, BONDI L, BAROFFIO L, et al.. Reduced memory region based deep Convolutional Neural Network detection [C]. 2016 IEEE 6th International Conference on Consumer Electronics-Berlin (ICCE-Berlin), IEEE, 2016: 15-19.