Dense-Attention CNN with Spatial-Attention Fusion for Robust Facial Expression Recognition
DOI:
https://doi.org/10.5755/j01.itc.54.3.41283
Keywords:
Expression recognition, Deep learning, Convolutional neural network, Attentional mechanism
Abstract
Facial expression recognition is increasingly applied in fields such as intelligent healthcare, online education, and assisted driving. However, traditional convolutional neural networks (CNNs) pay too little attention to the local facial regions associated with emotion, and classic loss functions cannot handle intra-class variability in facial expressions. This paper establishes a facial expression recognition model that combines deep learning with attention mechanisms and handles both static and dynamic facial expressions. The model extracts image features to obtain a rich multi-scale information flow while controlling the number of parameters. It constructs a spatial attention unit to focus on information with significant emotional intensity, and it combines an intra-class distance penalty term with the classification loss to supervise network learning. This approach addresses the insufficient attention that CNNs pay to regions of interest and reduces the variability among facial expressions of the same class. Experimental results show that the model's accuracy increases by 1.1% and 2.7% on the public CK+ and FER2013 datasets, respectively.
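The joint supervision described in the abstract, a classification loss combined with an intra-class distance penalty, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: the penalty is written in a center-loss style (half the mean squared distance of each feature to its class mean), the centers are computed per batch, and the names `joint_loss`, `intra_class_penalty`, `class_centers`, and the weight `lam` are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def class_centers(features, labels):
    """Mean feature vector per class (assumed: batch-level centers)."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def intra_class_penalty(features, labels, centers):
    """Half the mean squared distance of each feature to its class center."""
    total = 0.0
    for f, y in zip(features, labels):
        total += sum((a - b) ** 2 for a, b in zip(f, centers[y]))
    return 0.5 * total / len(features)


def joint_loss(logits, features, labels, lam=0.01):
    """Classification loss plus lam times the intra-class penalty.

    lam is a hypothetical trade-off weight, not a value from the paper.
    """
    ce = sum(softmax_cross_entropy(z, y) for z, y in zip(logits, labels))
    ce /= len(labels)
    centers = class_centers(features, labels)
    return ce + lam * intra_class_penalty(features, labels, centers)
```

In this sketch a larger `lam` pulls features of the same expression class closer together, at the cost of weighting the classification term relatively less.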
License
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.