Hierarchical LSTM-Based Audio and Video Emotion Recognition With Embedded Attention Mechanism

Tianbao Liu; Lingtao Zhang*; Wentao Yu; Dongchuan Wei; Yijun Fan

doi:10.3788/LOP202158.0210017

Laser & Optoelectronics Progress, 2021, 58(2): 0210017. Published online: 2021-01-11

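Author affiliation

College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, Hunan 410004, China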

Abstract

A single-layer long short-term memory (LSTM) network generalizes poorly on complex speech emotion recognition problems. To address this, a stacked (hierarchical) LSTM model with an embedded self-attention mechanism is proposed, and a penalty term is introduced to improve network performance. For emotion recognition on video sequences, an attention mechanism assigns each video frame a weight according to the amount of emotional information it contains, and classification is performed on the weighted frames. Finally, a weighted decision fusion method combines the facial expression and speech signals to produce the final emotion prediction. Experimental results show that, compared with single-modal emotion recognition, the proposed method improves recognition accuracy on the selected datasets by approximately 4%.
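
The abstract names two mechanisms without giving formulas: attention weights over video frames, and weighted decision fusion of the two modalities. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes the frame weights come from a softmax over per-frame scores and that fusion is a convex combination of the two class-probability vectors. The function names, the `w_audio` parameter, and all numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, frame_scores):
    """Weight each frame's feature vector by softmax(score) and sum them.

    Assumption: in a real model the scores would be produced by a learned
    layer; here they are simply passed in.
    """
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    pooled = [0.0] * dim
    for w, feat in zip(weights, frame_features):
        for i in range(dim):
            pooled[i] += w * feat[i]
    return pooled, weights


def weighted_decision_fusion(p_audio, p_video, w_audio=0.5):
    """Fuse two class-probability vectors as a convex combination.

    Assumption: w_audio is a hand-picked or validated weight in [0, 1];
    the abstract does not state how the weights are chosen.
    """
    if not 0.0 <= w_audio <= 1.0:
        raise ValueError("w_audio must lie in [0, 1]")
    fused = [w_audio * a + (1.0 - w_audio) * v for a, v in zip(p_audio, p_video)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return fused, label


# Hypothetical example: three emotion classes, audio and video disagree.
fused, label = weighted_decision_fusion([0.6, 0.3, 0.1], [0.2, 0.7, 0.1], w_audio=0.4)
```

With these made-up inputs the video branch carries more weight, so the fused prediction follows the video classifier's top class (index 1) even though the audio classifier preferred class 0.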

Tianbao Liu, Lingtao Zhang, Wentao Yu, Dongchuan Wei, Yijun Fan. Hierarchical LSTM-Based Audio and Video Emotion Recognition With Embedded Attention Mechanism[J]. Laser & Optoelectronics Progress, 2021, 58(2): 0210017.
