Real-Time Hand Tracking Method Based on a Neural Network and Kalman Filtering

Abstract

Traditional hand tracking algorithms suffer from poor real-time performance, low recognition accuracy, and sensitivity to environmental conditions. To address these problems, a real-time hand tracking method based on a neural network and Kalman filtering is proposed. The method first uses the neural network to locate the detection targets that appear in the video. It then uses a Kalman filter to estimate each target's motion and compares the estimate with the targets detected in the next frame. Finally, the detected targets are tracked and the trajectory of the hand movement is displayed in real time. Experimental results show that the method can track multiple hand targets in real time and maintains tracking when hands cross or deform during movement. The average processing speed is 21.212 frames/s and the tracking accuracy is 94.88%, which basically meets the requirements of stable, reliable, real-time, and robust hand tracking.
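The loop described in the abstract (detect, predict with a Kalman filter, compare the prediction with the next frame's detections, extend the trajectory) can be illustrated with a minimal sketch. Several things here are assumptions that the abstract does not specify: the neural network detector is replaced by a list of hand-center points passed in per frame, the motion model is constant velocity, matching is greedy nearest neighbour with a distance gate, and the names `AxisKalman`, `HandTracker`, and `max_dist`, along with all noise values, are hypothetical. It is not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image coordinate.

    State is (pos, vel); p00..p11 is the 2x2 state covariance.
    The noise values q and r are illustrative assumptions.
    """
    pos: float
    vel: float = 0.0
    p00: float = 10.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 10.0
    q: float = 0.01  # process noise added to the covariance diagonal
    r: float = 1.0   # measurement noise variance

    def predict(self, dt: float = 1.0) -> float:
        """Propagate state and covariance one frame: P = F P F^T + Q."""
        self.pos += self.vel * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos

    def update(self, z: float) -> None:
        """Correct the prediction with a measured position z (H = [1, 0])."""
        s = self.p00 + self.r                # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s  # Kalman gain
        residual = z - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        p00, p01 = (1 - k0) * self.p00, (1 - k0) * self.p01
        p10, p11 = self.p10 - k1 * self.p00, self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


class HandTracker:
    """Keeps one (x, y) filter pair and one trajectory per tracked hand."""

    def __init__(self, max_dist: float = 50.0):
        self.max_dist = max_dist  # gating distance in pixels (assumed)
        self.tracks = {}          # track id -> (AxisKalman x, AxisKalman y)
        self.trails = {}          # track id -> list of (x, y) estimates
        self.next_id = 0

    def step(self, detections):
        """Process one frame of detected hand centers [(x, y), ...]."""
        # 1. Predict where every existing track should be in this frame.
        preds = {tid: (kx.predict(), ky.predict())
                 for tid, (kx, ky) in self.tracks.items()}
        # 2. Greedy nearest-neighbour matching of predictions to detections.
        pairs = sorted(
            (math.dist(p, d), tid, di)
            for tid, p in preds.items()
            for di, d in enumerate(detections)
        )
        used_tracks, used_dets = set(), set()
        for dist, tid, di in pairs:
            if dist > self.max_dist:
                break  # remaining pairs are even farther apart
            if tid in used_tracks or di in used_dets:
                continue
            used_tracks.add(tid)
            used_dets.add(di)
            # 3. Correct the matched track with its detection.
            kx, ky = self.tracks[tid]
            kx.update(detections[di][0])
            ky.update(detections[di][1])
        # 4. Unmatched detections start new tracks.
        for di, (x, y) in enumerate(detections):
            if di not in used_dets:
                self.tracks[self.next_id] = (AxisKalman(x), AxisKalman(y))
                self.trails[self.next_id] = []
                used_tracks.add(self.next_id)
                self.next_id += 1
        # 5. Record the current estimate of every track seen this frame.
        for tid in used_tracks:
            kx, ky = self.tracks[tid]
            self.trails[tid].append((kx.pos, ky.pos))
        return {tid: self.trails[tid][-1] for tid in used_tracks}
```

Calling `tracker.step(detections)` once per frame returns the current position estimate for each track ID, and `tracker.trails` holds the trajectories that would be drawn on screen. Unmatched tracks simply coast on their prediction in this sketch; a practical tracker would also delete tracks that stay unmatched for several frames, and might replace the greedy step with an optimal assignment.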

ZENG Gong-ren, YAO Jian-min, YAN Qun, LIN Zhi-xian, GUO Tai-liang, LIN Chang. Hand real-time tracking method based on neural network and Kalman filter[J]. Chinese Journal of Liquid Crystals and Displays, 2020, 35(5): 464.