Journal Article Details
Frontiers in Neurorobotics
TFE: A Transformer Architecture for Occlusion Aware Facial Expression Recognition
Jixun Gao [1], Yuanyuan Zhao [2]
[1] Department of Computer Science, Henan University of Engineering, Zhengzhou, China; [2] Department of Computer Science, Zhengzhou University of Technology, Zhengzhou, China
Keywords: affective computing; facial expression recognition; occlusion; transformer; deep learning
DOI: 10.3389/fnbot.2021.763100
Source: DOAJ
[Abstract]

Facial expression recognition (FER) in uncontrolled environments is challenging because of the many unconstrained conditions involved. Although existing deep learning-based FER approaches are promising at recognizing frontal faces, they still struggle to identify expressions on faces that are partly occluded in unconstrained scenarios. To mitigate this issue, we propose a transformer-based FER method (TFE) that adaptively focuses on the most important, unoccluded facial regions. TFE is based on the multi-head self-attention mechanism, which can flexibly attend to a sequence of image patches to encode the critical cues for FER. Compared with a traditional transformer, the novelty of TFE is two-fold: (i) To effectively select the discriminative facial regions, we integrate the attention weights from the various transformer layers into an attention map that guides the network to perceive the important facial regions. (ii) Given an occluded facial image as input, we use a decoder to reconstruct the corresponding non-occluded face, so TFE can infer the occluded regions to better recognize the facial expression. We evaluate TFE on two prevalent in-the-wild facial expression datasets (AffectNet and RAF-DB) and on their modifications with artificial occlusions. Experimental results show that TFE improves recognition accuracy on both non-occluded and occluded faces. Compared with other state-of-the-art FER methods, TFE obtains consistent improvements. Visualization results show that TFE automatically focuses on the discriminative, non-occluded facial regions for robust FER.
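
The abstract does not say how the per-layer attention weights are merged into a single map. As a rough illustration only, the sketch below uses attention rollout, a common aggregation technique that is swapped in here as an assumption and may differ from what TFE actually does. The function names, the class token at index 0, the 4 x 4 patch grid, and the random weights standing in for a trained model are all hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def attention_rollout(layer_attentions):
    """Merge per-layer attention into one score per image patch.

    layer_attentions: one entry per layer; each entry is a list of heads,
    and each head is a tokens x tokens matrix whose rows sum to 1.
    Token 0 is assumed to be a class token (an assumption of this sketch).
    """
    n = len(layer_attentions[0][0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for heads in layer_attentions:
        fused = []
        for i in range(n):
            # Average over heads, then add the identity for the residual path.
            row = [sum(h[i][j] for h in heads) / len(heads)
                   + (1.0 if i == j else 0.0) for j in range(n)]
            total = sum(row)
            # Re-normalize so the row is a distribution again.
            fused.append([v / total for v in row])
        # Compose this layer with the attention accumulated so far.
        rollout = matmul(fused, rollout)
    # Keep the class token's attention to the patches and normalize it.
    patch_scores = rollout[0][1:]
    total = sum(patch_scores)
    return [s / total for s in patch_scores]


def random_attention(rng, heads, tokens):
    """Hypothetical stand-in for trained attention weights."""
    result = []
    for _ in range(heads):
        matrix = []
        for _ in range(tokens):
            weights = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(tokens)]
            total = sum(weights)
            matrix.append([w / total for w in weights])
        result.append(matrix)
    return result


rng = random.Random(0)
# 6 layers, 4 heads, 1 class token + 16 patches (all illustrative sizes).
layers = [random_attention(rng, heads=4, tokens=17) for _ in range(6)]
scores = attention_rollout(layers)
# Reshape the 16 patch scores into a 4 x 4 attention map.
attention_map = [scores[r * 4:(r + 1) * 4] for r in range(4)]
```

With real attention weights, patches with the highest values in `attention_map` would be the candidates for the important, non-occluded regions the abstract refers to; how TFE then uses the map to guide the network is not specified here.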

[License]

Unknown   
