Laser & Optoelectronics Progress, 2020, 57(24): 241506. Published online: 2020-12-01
Improved Human Action Recognition Algorithm Based on Two-Stream Faster Region Convolutional Neural Network
Keywords: machine vision; two-stream faster region convolutional neural network; human action recognition; squeeze and excitation; intersection-over-union loss function
Abstract
Deep neural networks have achieved breakthroughs in static image recognition and are gradually being extended to video recognition. Human action recognition is a research hotspot and a difficult problem in video recognition. This paper therefore proposes an improved human action recognition algorithm based on a two-stream faster region convolutional neural network (Faster RCNN). First, RGB (red, green, blue) images and optical flow data are used as network inputs to train two Faster RCNNs separately; then, the trained network models are fused, and an improved squeeze-and-excitation block is introduced to reweight the feature channels so that important features are highlighted; finally, the complete intersection-over-union (CIoU) loss is used as the bounding-box regression loss to handle cases such as a predicted box that does not intersect the ground-truth box, where a plain IoU loss provides no gradient. Experimental results show that, compared with the traditional Faster RCNN, the proposed algorithm achieves higher accuracy on the UCF101 action recognition dataset.
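The abstract specifies the two-stream design but not the fusion rule, so the following PyTorch sketch is only one plausible reading: two torchvision Faster R-CNN detectors are trained independently on RGB frames and on optical-flow images (rendered as 3-channel inputs so the same backbone applies), and their per-frame detections are merged with class-agnostic non-maximum suppression at test time. TwoStreamDetector and the NMS-based merge are illustrative assumptions, not the authors' method.

import torch
import torch.nn as nn
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.ops import nms

class TwoStreamDetector(nn.Module):
    # Hypothetical late-fusion wrapper: the paper's fusion rule is not given
    # in the abstract, so overlapping detections from the two streams are
    # simply merged with NMS here.
    def __init__(self, num_classes: int):
        super().__init__()
        self.rgb_stream = fasterrcnn_resnet50_fpn(num_classes=num_classes)
        self.flow_stream = fasterrcnn_resnet50_fpn(num_classes=num_classes)

    @torch.no_grad()
    def forward(self, rgb_images, flow_images, iou_thresh: float = 0.5):
        self.eval()
        fused = []
        for rgb_det, flow_det in zip(self.rgb_stream(rgb_images),
                                     self.flow_stream(flow_images)):
            boxes = torch.cat([rgb_det["boxes"], flow_det["boxes"]])
            scores = torch.cat([rgb_det["scores"], flow_det["scores"]])
            labels = torch.cat([rgb_det["labels"], flow_det["labels"]])
            keep = nms(boxes, scores, iou_thresh)  # class-agnostic merge
            fused.append({"boxes": boxes[keep], "scores": scores[keep],
                          "labels": labels[keep]})
        return fused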
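The paper's "improved" squeeze-and-excitation module is not detailed in the abstract; as a baseline for comparison, here is a minimal sketch of the standard SE block (Hu et al.), which pools each channel globally and learns per-channel weights that rescale the feature maps. The class name SEBlock and the reduction ratio 16 are assumptions.

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    # Standard squeeze-and-excitation channel attention; the paper's
    # improvements to this block are not specified in the abstract.
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)           # global average pool
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # bottleneck FC
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                                # channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.squeeze(x).view(b, c)       # one descriptor per channel
        w = self.excite(w).view(b, c, 1, 1)  # learned channel weights
        return x * w                         # reweight the feature maps

# usage: SEBlock(256)(torch.randn(2, 256, 14, 14)) keeps the input's shape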
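The complete IoU (CIoU) loss augments 1 - IoU with a normalized centre-distance penalty and an aspect-ratio consistency term, so a predicted box that does not overlap its ground-truth box still receives a meaningful gradient. A minimal sketch, assuming boxes in (x1, y1, x2, y2) corner format:

import math
import torch

def ciou_loss(pred: torch.Tensor, target: torch.Tensor,
              eps: float = 1e-7) -> torch.Tensor:
    # CIoU = 1 - IoU + rho^2 / c^2 + alpha * v, for (N, 4) corner-format boxes.
    # Intersection and IoU.
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Squared distance between box centres (rho^2).
    cx_p = (pred[:, 0] + pred[:, 2]) / 2
    cy_p = (pred[:, 1] + pred[:, 3]) / 2
    cx_t = (target[:, 0] + target[:, 2]) / 2
    cy_t = (target[:, 1] + target[:, 3]) / 2
    rho2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2

    # Squared diagonal of the smallest enclosing box (c^2).
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    w_p = (pred[:, 2] - pred[:, 0]).clamp(min=eps)
    h_p = (pred[:, 3] - pred[:, 1]).clamp(min=eps)
    w_t = (target[:, 2] - target[:, 0]).clamp(min=eps)
    h_t = (target[:, 3] - target[:, 1]).clamp(min=eps)
    v = (4 / math.pi ** 2) * (torch.atan(w_t / h_t) - torch.atan(w_p / h_p)) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)

    return (1 - iou + rho2 / c2 + alpha * v).mean()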
Ruyi Guo, Jie Jin, Gaohua Liu, Kaiyan Liu, Shiqi Jiang. Improved Human Action Recognition Algorithm Based on Two-Stream Faster Region Convolutional Neural Network[J]. Laser & Optoelectronics Progress, 2020, 57(24): 241506.