International Journal of Advanced Robotic Systems
Towards Efficient Multi-Modal Emotion Recognition
Keywords: Emotion Recognition; Video Processing; Speech Processing; Canonical Correlations; GMM-UBM
DOI: 10.5772/54002
Subject classification: Automation Engineering
Source: InTech
【Abstract】
The paper presents a multi-modal emotion recognition system that exploits audio and video (i.e., facial expression) information. The system first processes the two sources of information individually to produce corresponding matching scores and then combines these scores to obtain a classification decision. For the video part of the system, a novel approach to emotion recognition, relying on image-set matching, is developed. The proposed approach avoids the need to detect and track specific facial landmarks throughout the given video sequence, a common source of error in video-based emotion recognition systems, and therefore adds robustness to the video processing chain. The audio part of the system, on the other hand, relies on utterance-specific Gaussian Mixture Models (GMMs) adapted from a Universal Background Model (UBM) via maximum a posteriori (MAP) estimation. It improves upon the standard UBM-MAP procedure by exploiting gender information when bu...
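The score-level combination described in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule the system uses, so the min-max normalization, the weighted sum rule, the function names (`min_max_normalize`, `fuse_scores`, `classify`), the `audio_weight` parameter and the example scores below are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one modality's matching scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: the modality carries no preference between classes.
        return {label: 0.5 for label in scores}
    return {label: (s - lo) / (hi - lo) for label, s in scores.items()}


def fuse_scores(audio: Dict[str, float],
                video: Dict[str, float],
                audio_weight: float = 0.5) -> Dict[str, float]:
    """Weighted sum of normalized audio and video scores (assumed fusion rule).

    Both dictionaries are expected to contain the same emotion labels.
    """
    a = min_max_normalize(audio)
    v = min_max_normalize(video)
    return {label: audio_weight * a[label] + (1.0 - audio_weight) * v[label]
            for label in a}


def classify(audio: Dict[str, float],
             video: Dict[str, float],
             audio_weight: float = 0.5) -> str:
    """Return the emotion label with the highest fused score."""
    fused = fuse_scores(audio, video, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical scores: log-likelihood-like values for audio, similarities for video.
audio_scores = {"anger": -12.0, "joy": -10.0, "sadness": -15.0}
video_scores = {"anger": 0.9, "joy": 0.6, "sadness": 0.2}
decision = classify(audio_scores, video_scores)
```

Normalizing each modality before the sum keeps scores on different scales (for example, GMM log-likelihoods and image-set similarities) from dominating one another; the weight would normally be tuned on held-out data.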
【License】
CC BY