Action recognition method of spatio-temporal feature fusion deep learning network

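In human action recognition based on natural scene images, occlusion, background clutter, and uneven illumination degrade the recognition results; methods that use 3D human skeleton sequences can overcome these drawbacks. First, taking the spatio-temporal characteristics of human actions into account, a spatio-temporal feature fusion deep learning network for skeleton-based action recognition is proposed. Second, a view-invariant feature representation is built from the geometric features of the skeleton; a CNN (Convolutional Neural Network) learns the local spatial features of the skeleton, an LSTM (Long Short-Term Memory) network operating in the spatial domain learns the correlation features among the skeleton's joints, and an LSTM network operating in the temporal domain learns the spatio-temporal correlation features of the skeleton sequence. Finally, the algorithm is validated on the NTU RGB+D dataset. Experimental results show that recognition accuracy improves and that the algorithm is robust to multi-view skeletons.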

Abstract

Action recognition from natural scene images is affected by complex illumination conditions and cluttered backgrounds, and there is growing interest in addressing these problems with 3D skeleton data. First, considering the spatio-temporal characteristics of human actions, a spatio-temporal feature fusion deep learning network for action recognition is proposed. Second, a view-invariant feature representation is constructed from the geometric features of the skeletons. Local spatial features are extracted by short-time CNNs, a spatial LSTM network learns the relations between the joints within a skeleton frame, and a temporal LSTM network learns the spatio-temporal relations across the skeleton sequence. Finally, the network is evaluated on the NTU RGB+D dataset, where it achieves state-of-the-art performance for 3D human action analysis. Experimental results show that the network is strongly robust to changes in viewpoint.
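The abstract does not spell out how the view-invariant representation is computed from the skeleton geometry. As an illustration only, the sketch below shows one common way to obtain such a representation: express every joint in a body-centered coordinate frame derived from the first frame of the sequence. The function name `normalize_view` and the joint indices `root`, `left_hip`, `right_hip`, and `spine` are hypothetical placeholders, not the paper's definitions or the NTU RGB+D joint numbering.

```python
import numpy as np


def normalize_view(seq, root=0, left_hip=1, right_hip=2, spine=3):
    """Map a skeleton sequence into a body-centered frame.

    seq has shape (T, J, 3): T frames, J joints, xyz coordinates.
    The joint indices are illustrative assumptions, not taken from the paper.
    """
    seq = np.asarray(seq, dtype=float)
    first = seq[0]
    origin = first[root]

    # x axis: from left hip to right hip in the first frame.
    x_axis = first[right_hip] - first[left_hip]
    x_axis = x_axis / np.linalg.norm(x_axis)

    # y axis: spine direction with its x component removed (Gram-Schmidt).
    up = first[spine] - origin
    y_axis = up - np.dot(up, x_axis) * x_axis
    y_axis = y_axis / np.linalg.norm(y_axis)

    # z axis completes a right-handed orthonormal basis.
    z_axis = np.cross(x_axis, y_axis)

    # Rows of `rotation` are the body axes, so v @ rotation.T gives the
    # coordinates of v in the body frame.
    rotation = np.stack([x_axis, y_axis, z_axis])
    return (seq - origin) @ rotation.T
```

Because the axes are rebuilt from the skeleton itself, applying any rigid camera motion (rotation plus translation) to the input leaves the output unchanged, which is the property a view-invariant representation needs. The sketch assumes the hip and spine vectors are neither zero-length nor parallel.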

Pei Xiaomin, Fan Huijie, Tang Yandong. Action recognition method of spatio-temporal feature fusion deep learning network[J]. Infrared and Laser Engineering, 2018, 47(2): 0203007. (in Chinese)
