Journal article

【Abstract】

Emotion recognition plays an essential role in interpersonal communication. However, existing recognition systems use only features of a single modality for emotion recognition, ignoring the interaction of information from different modalities. Therefore, in this study, we propose a global-aware Cross-modal feature Fusion Network (GCF2-Net) for emotion recognition. We construct a residual cross-modal fusion attention module (ResCMFA) to fuse information from multiple modalities and design a global-aware module to capture global details. More specifically, we first use transfer learning to extract wav2vec 2.0 features and text features, which are then fused by the ResCMFA module. The cross-modal fusion features are then fed into the global-aware module to capture the most essential emotional information globally. Finally, experimental results show that the proposed method has significant advantages over state-of-the-art methods on the IEMOCAP and MELD datasets.

Attachments
Files	Size	Format	View
RO202310106389604ZK.pdf	1143KB	PDF	download

Frontiers in Neuroscience
GCF2-Net: global-aware cross-modal feature fusion network for speech emotion recognition
Neuroscience
Xiaoshuang Sang¹, Wei Liu¹, Lingling Wang¹, Jiusong Luo¹, Feng Li²
[1] Department of Computer Science and Technology, Anhui University of Finance and Economics, Anhui, China; [2] School of Information Science and Technology, University of Science and Technology of China, Anhui, China
Keywords: speech emotion recognition; global-aware; feature fusion network; wav2vec 2.0; cross-modal
DOI : 10.3389/fnins.2023.1183132
Received 2023-03-09, accepted 2023-04-13, published 2023
Source: Frontiers