Research Article

Facial Expression Recognition using Squeeze and Excitation-powered Swin Transformers

by Arpita Vats, Aman Chadha
Journal of Advanced Artificial Intelligence
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Number 6
Year of Publication: 2025
Authors: Arpita Vats, Aman Chadha
DOI: 10.5120/ijcajaai202428

Arpita Vats, Aman Chadha. Facial Expression Recognition using Squeeze and Excitation-powered Swin Transformers. Journal of Advanced Artificial Intelligence. 1, 6 (Mar 2025), 15-21. DOI=10.5120/ijcajaai202428

@article{ 10.5120/ijcajaai202428,
author = { Arpita Vats, Aman Chadha },
title = { Facial Expression Recognition using Squeeze and Excitation-powered Swin Transformers },
journal = { Journal of Advanced Artificial Intelligence },
issue_date = { Mar 2025 },
volume = { 1 },
number = { 6 },
month = { Mar },
year = { 2025 },
pages = { 15-21 },
numpages = {7},
url = { https://jaaionline.phdfocus.com/archives/volume1/number6/facial-expression-recognition-using-squeeze-and-excitation-powered-swin-transformers/ },
doi = { 10.5120/ijcajaai202428 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%A Arpita Vats
%A Aman Chadha
%T Facial Expression Recognition using Squeeze and Excitation-powered Swin Transformers
%J Journal of Advanced Artificial Intelligence
%V 1
%N 6
%P 15-21
%D 2025
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The ability to recognize and interpret facial emotions is a critical component of human communication, as it allows individuals to understand and respond to emotions conveyed through facial expressions and vocal tones. Facial emotion recognition is a complex cognitive process that integrates visual and auditory information with prior knowledge and social cues. It plays a crucial role in social interaction, affective processing, and empathy, and underpins many real-world applications, including human-computer interaction, virtual assistants, and mental health diagnosis and treatment. The development of accurate and efficient models for facial emotion recognition is therefore of great importance and has the potential to make a significant impact across fields of study. Facial Emotion Recognition (FER) is of great significance in computer vision and artificial intelligence, with vast commercial and academic potential in fields such as security, advertising, and entertainment. We propose a FER framework that employs Swin Vision Transformers (SwinT) and a Squeeze-and-Excitation (SE) block to address vision tasks. The approach combines the transformer's attention mechanism with SE and Sharpness-Aware Minimization (SAM) to improve the efficiency of the model, as transformers often require a large amount of data. Our focus was to create an efficient FER model based on the SwinT architecture that can recognize facial emotions using minimal data. We trained our model on a hybrid dataset and evaluated its performance on the AffectNet dataset, achieving an F1-score of 0.5420, which surpassed the winner of the Affective Behavior Analysis in the Wild (ABAW) Competition held at the European Conference on Computer Vision (ECCV) 2022 [10].
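The SE block the abstract refers to (Hu et al., ref [9]) recalibrates feature channels: it averages each channel to a scalar ("squeeze"), passes the result through a small bottleneck MLP with a sigmoid gate ("excitation"), and rescales each channel by its gate. The following is a minimal NumPy sketch of that mechanism; the function name, shapes, and random weights are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_excitation(features, w1, w2):
    """Squeeze-and-Excitation channel recalibration (illustrative sketch).

    features: (C, H, W) feature map; w1: (C//r, C) and w2: (C, C//r)
    form the bottleneck MLP with reduction ratio r.
    """
    # Squeeze: global average pooling per channel -> (C,)
    z = features.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid gate in (0, 1)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))
    # Scale: reweight every spatial position of channel i by gate s[i]
    return features * s[:, None, None]

# Toy example: 8 channels, reduction ratio r = 4
rng = np.random.default_rng(0)
c, r = 8, 4
x = rng.standard_normal((c, 6, 6))
w1 = rng.standard_normal((c // r, c)) * 0.1
w2 = rng.standard_normal((c, c // r)) * 0.1
y = squeeze_excitation(x, w1, w2)
print(y.shape)  # (8, 6, 6)
```

Because the gate is a single scalar per channel, the block adds very few parameters, which is consistent with the abstract's goal of an efficient model trained on minimal data.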

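SAM (Foret et al., ref [7]), the other efficiency ingredient the abstract names, replaces the ordinary gradient step with a descent from the worst-case point in a small L2 ball around the current weights. A minimal sketch of one SAM update on a toy quadratic loss L(w) = ||w||^2 / 2 (gradient simply w); the toy loss, step sizes, and function names are illustrative assumptions, not the paper's training setup.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization update (illustrative sketch)."""
    g = grad_fn(w)
    # Ascend to the (first-order) worst case within an L2 ball of radius rho
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descend using the gradient evaluated at the perturbed weights
    g_sharp = grad_fn(w + eps)
    return w - lr * g_sharp

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w, lambda w: w)  # gradient of the toy quadratic is w itself
print(np.linalg.norm(w))  # shrinks toward the minimum at the origin
```

Each step therefore costs two gradient evaluations instead of one, trading compute for flatter minima that tend to generalize better, which is why it suits a transformer trained on limited data.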
References
  1. Mouath Aouayeb, Wassim Hamidouche, Catherine Soladie, Kidiyo Kpalma, and Renaud Seguier. Learning vision transformer with squeeze and excitation for facial expression recognition, 2021.
  2. Xiangning Chen, Cho-Jui Hsieh, and Boqing Gong. When vision transformers outperform ResNets without pre-training or strong data augmentations, 2021.
  3. Phan Tran Dac Thinh, Hoang Manh Hung, Hyung-Jeong Yang, Soo-Hyung Kim, and Guee-Sang Lee. Emotion recognition with sequential multi-task learning technique. In 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pages 3586–3589, 2021.
  4. Charles Darwin. The Expression of the Emotions in Man and Animals. Cambridge Library Collection: Darwin, Evolution and Genetics. Cambridge University Press, 2013.
  5. Didan Deng, Zhaokang Chen, and Bert Shi. Multitask emotion recognition with incomplete labels. pages 592–599, Nov 2020.
  6. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2021.
  7. Pierre Foret, Ariel Kleiner, Hossein Mobahi, and Behnam Neyshabur. Sharpness-aware minimization for efficiently improving generalization, 2020.
  8. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition, 2015.
  9. Jie Hu, Li Shen, Samuel Albanie, Gang Sun, and Enhua Wu. Squeeze-and-excitation networks, 2017.
  10. Dimitrios Kollias. ABAW: Learning from synthetic data and multi-task learning challenges, 2022.
  11. Felix Kuhnke, Lars Rumberg, and Jorn Ostermann. Two-stream aural-visual affect analysis in the wild. In 2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020). IEEE, Nov 2020.
  12. Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows, 2021.
  13. S. Minaee, Y. Boykov, F. Porikli, A. Plaza, N. Kehtarnavaz, and D. Terzopoulos. Image segmentation using deep learning: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44, 2022.
  14. Ali Mollahosseini, Behzad Hasani, and Mohammad H. Mahoor. AffectNet: A database for facial expression, valence, and arousal computing in the wild. IEEE Transactions on Affective Computing, Jan 2019.
  15. Andrey V. Savchenko, Lyudmila V. Savchenko, and Ilya Makarov. Classifying emotions and engagement in online learning based on a single facial expression recognition neural network. IEEE Transactions on Affective Computing, pages 1–12, 2022.
  16. Y.-I. Tian, T. Kanade, and J.F. Cohn. Recognizing action units for facial expression analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2):97–115, 2001.
  17. Zhengyao Wen, Wenzhong Lin, Tao Wang, and Ge Xu. Distract your attention: Multi-head cross attention network for facial expression recognition, 2021.
  18. Hanzhong Zhang, Jibin Yin, and Xiangliang Zhang. The study of a five-dimensional emotional model for facial emotion recognition. Mobile Information Systems, 2020:1–10, Dec 2020.
Index Terms

Computer Science
Information Sciences

Keywords

SAM, Swin-T, Squeeze and Excitation, Emotion Recognition