Journal article

【Abstract】

This paper presents a multi-modal emotion recognition system that exploits audio and video (i.e., facial expression) information. The system first processes each source of information individually to produce matching scores and then combines these scores to obtain a classification decision. For the video part of the system, a novel approach to emotion recognition based on image-set matching is developed. The proposed approach avoids the need to detect and track specific facial landmarks throughout the video sequence, a common source of error in video-based emotion recognition systems, and therefore adds robustness to the video processing chain. The audio part of the system, on the other hand, relies on utterance-specific Gaussian Mixture Models (GMMs) adapted from a Universal Background Model (UBM) via maximum a posteriori (MAP) estimation. It improves upon the standard UBM-MAP procedure by exploiting gender information when bu...
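
The abstract describes score-level fusion: each modality yields one matching score per emotion class, and the scores are then combined into a single decision. The preview is cut off before the fusion rule is stated, so the sketch below assumes a common choice (min-max normalization followed by a weighted sum) rather than the authors' actual method. The function names, the `audio_weight` parameter, and the example scores are all hypothetical.

```python
from typing import Dict


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one modality's scores to [0, 1] so the modalities are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal: this modality expresses no preference.
        return {label: 0.5 for label in scores}
    return {label: (s - lo) / (hi - lo) for label, s in scores.items()}


def fuse_scores(
    audio: Dict[str, float],
    video: Dict[str, float],
    audio_weight: float = 0.5,
) -> str:
    """Return the emotion label with the highest weighted-sum score.

    Assumption: a weighted sum rule; the source does not state its fusion rule.
    """
    if set(audio) != set(video):
        raise ValueError("both modalities must score the same emotion labels")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")

    a = min_max_normalize(audio)
    v = min_max_normalize(video)
    fused = {
        label: audio_weight * a[label] + (1.0 - audio_weight) * v[label]
        for label in a
    }
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Illustrative numbers only: log-likelihood-like audio scores and
    # similarity-like video scores on different scales.
    audio_scores = {"anger": -41.2, "joy": -39.8, "sadness": -44.0}
    video_scores = {"anger": 0.62, "joy": 0.71, "sadness": 0.35}
    print(fuse_scores(audio_scores, video_scores))
```

Normalizing before summing matters because the two modalities typically produce scores on unrelated scales; without it, the modality with the larger numeric range would dominate the decision regardless of the chosen weight.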

【License】

CC BY

International Journal of Advanced Robotic Systems
Towards Efficient Multi-Modal Emotion Recognition


Keywords: Emotion Recognition; Video Processing; Speech Processing; Canonical Correlations; GMM-UBM
DOI: 10.5772/54002
Subject classification: Automation Engineering
Source: InTech