Laser & Optoelectronics Progress, 2020, 57(18): 181506. Online publication: 2020-09-02


Human Action Recognition Algorithm Based on Spatio-Temporal Interactive Attention Model
Author affiliation:
Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, Jiangsu 214122, China
Cite this article:


Na Pan, Min Jiang, Jun Kong. Human Action Recognition Algorithm Based on Spatio-Temporal Interactive Attention Model[J]. Laser & Optoelectronics Progress, 2020, 57(18): 181506.

References

[1] Simonyan K, Zisserman A. Two-stream convolutional networks for action recognition in videos[C]∥Advances in Neural Information Processing Systems, December 8-13, 2014, Montreal, Quebec, Canada. Curran Associates, Inc., 2014: 568-576.

[2] Wang L M, Xiong Y J, Wang Z, et al. Temporal segment networks: towards good practices for deep action recognition[M]∥Computer Vision-ECCV 2016. Cham: Springer International Publishing, 2016: 20-36.

[3] Carreira J, Zisserman A. Quo vadis, action recognition? A new model and the kinetics dataset[C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE Press, 2017: 4724-4733.

[4] Mnih V, Heess N, Graves A, et al. Recurrent models of visual attention[C]∥NIPS'14: Proceedings of the 27th International Conference on Neural Information Processing Systems, Volume 2, 2014: 2204-2212.

[5] Fan L F, Chen Y X, Wei P, et al. Inferring shared attention in social scene videos[C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE Press, 2018: 6460-6468.

[6] Lu M L, Li Z N, Wang Y M, et al. Deep attention network for egocentric action recognition[J]. IEEE Transactions on Image Processing, 2019, 28(8): 3703-3713.

[7] Fu J, Liu J, Tian H J, et al. Dual attention network for scene segmentation[C]∥2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 15-20, 2019, Long Beach, CA, USA. New York: IEEE Press, 2019: 3141-3149.

[8] Zhu M K, Lu X L. Human action recognition algorithm based on Bi-LSTM-attention model[J]. Laser & Optoelectronics Progress, 2019, 56(15): 151503.

[9] Tang Y S, Tian Y, Lu J W, et al. Deep progressive reinforcement learning for skeleton-based action recognition[C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE Press, 2018: 5323-5332.

[10] Jing L L, Yang X D, Tian Y L. Video you only look once: overall temporal convolutions for action recognition[J]. Journal of Visual Communication and Image Representation, 2018, 52: 58-65.

[11] Yu T Z, Guo C X, Wang L F, et al. Joint spatial-temporal attention for action recognition[J]. Pattern Recognition Letters, 2018, 112: 226-233.

[12] Lu L H, Di H J, Lu Y, et al. Spatio-temporal attention mechanisms based model for collective activity recognition[J]. Signal Processing: Image Communication, 2019, 74: 162-174.

[13] He K M, Gkioxari G, Dollár P, et al. Mask R-CNN[C]∥2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy. New York: IEEE Press, 2017: 2980-2988.

[14] Fan L J, Huang W B, Gan C, et al. End-to-end learning of motion representation for video understanding[C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE Press, 2018: 6016-6025.

[15] Li Z Y, Gavrilyuk K, Gavves E, et al. Video LSTM convolves, attends and flows for action recognition[J]. Computer Vision and Image Understanding, 2018, 166: 41-50.

[16] Zhang J X, Hu H F. Deep spatiotemporal relation learning with 3D multi-level dense fusion for video action recognition[J]. IEEE Access, 2019, 7: 15222-15229.

[17] Khowaja S A, Lee S L. Hybrid and hierarchical fusion networks: a deep cross-modal learning architecture for action recognition[J]. Neural Computing and Applications, 2019: 1-12.

[18] Wang H, Schmid C. Action recognition with improved trajectories[C]∥2013 IEEE International Conference on Computer Vision, December 1-8, 2013, Sydney, NSW, Australia. New York: IEEE Press, 2013: 3551-3558.

[19] Peng X J, Wang L M, Wang X X, et al. Bag of visual words and fusion methods for action recognition: comprehensive study and good practice[J]. Computer Vision and Image Understanding, 2016, 150: 109-125.

[20] Lan Z Z, Lin M, Li X C, et al. Beyond Gaussian pyramid: multi-skip feature stacking for action recognition[C]∥2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, Boston, MA, USA. New York: IEEE Press, 2015: 204-212.

[21] Zhu Y, Lan Z Z, Newsam S, et al. Hidden two-stream convolutional networks for action recognition[M]∥Computer Vision-ACCV 2018. Cham: Springer International Publishing, 2019: 363-378.

[22] Tu Z G, Xie W, Dauwels J, et al. Semantic cues enhanced multimodality multistream CNN for action recognition[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(5): 1423-1437.

[23] Tran A, Cheong L F. Two-stream flow-guided convolutional attention networks for action recognition[C]∥2017 IEEE International Conference on Computer Vision Workshops (ICCVW), October 22-29, 2017, Venice, Italy. New York: IEEE Press, 2017: 3110-3119.

[24] Du W B, Wang Y L, Qiao Y. Recurrent spatial-temporal attention network for action recognition in videos[J]. IEEE Transactions on Image Processing, 2018, 27(3): 1347-1360.

[25] Cao C Q, Zhang Y F, Zhang C J, et al. Action recognition with joints-pooled 3D deep convolutional descriptors[C]∥IJCAI'16: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016: 3324-3330.

[26] Villegas R, Yang J, Zou Y, et al. Learning to generate long-term future via hierarchical prediction[C]∥Proceedings of the 34th International Conference on Machine Learning, Volume 70, August 6-11, 2017, Sydney, Australia. JMLR.org, 2017: 3560-3569.

[27] Gao R H, Xiong B, Grauman K. Im2flow: motion hallucination from static images for action recognition[C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE Press, 2018: 5937-5947.


