Design and Implementation of a Multi-Tier Deep Learning Framework for Robust Facial Emotion Recognition Using CNNs, Hybrid Boosting, and Vision Transformers

Authors

  • Ketan Sarvakar, Dr. Kaushikkumar Rana

Keywords:

Facial Emotion Recognition (FER); Class Balancing; Convolutional Neural Networks (CNNs: VGG16, VGG19, ResNet50, InceptionV3, MobileNet); Boosting Algorithms (AdaBoost, Gradient Boosting, XGBoost); Vision Transformer (ViT); Hybrid Deep Learning Models; Fixed Hyperparameters; FANE Dataset

Abstract

Facial Emotion Recognition (FER) faces challenges such as class imbalance, subtle expression variations, and limited model generalizability. This paper proposes a three-tier benchmark on the FANE dataset, which covers nine emotion classes. We compare a rule-based Sequential model, five CNN architectures (VGG16, VGG19, ResNet50, InceptionV3, MobileNet), hybrid CNN + Boosting models (AdaBoost, Gradient Boosting, XGBoost), and a custom Vision Transformer (ViT), all trained with fixed hyperparameters. Experiments on both imbalanced and balanced versions of the dataset show that CNN + Boosting performs best after balancing, while the ViT benefits significantly from class balancing. The results emphasize the value of standardized training settings and architectural robustness in FER.
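To illustrate the hybrid CNN + Boosting idea described in the abstract, the sketch below uses a pretrained CNN (here MobileNet) as a frozen feature extractor and trains an XGBoost classifier on the pooled features. This is a minimal, hypothetical example, not the paper's exact pipeline: the image size, hyperparameters, and placeholder data are assumptions for illustration only.

```python
# Hybrid CNN + Boosting sketch: frozen CNN features -> boosted classifier.
# All settings below are illustrative assumptions, not the paper's configuration.
import numpy as np
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.applications.mobilenet import preprocess_input
from xgboost import XGBClassifier

# MobileNet without its classification head; global average pooling yields
# one 1024-dimensional feature vector per face image.
backbone = MobileNet(weights="imagenet", include_top=False, pooling="avg",
                     input_shape=(224, 224, 3))

def extract_features(images):
    """images: float array of shape (N, 224, 224, 3) with pixel values in [0, 255]."""
    return backbone.predict(preprocess_input(images), verbose=0)

# Placeholder face crops and integer emotion labels (0-8 for nine classes),
# so the sketch runs end to end; real inputs would come from the FANE dataset.
X_train = np.random.rand(36, 224, 224, 3) * 255.0
y_train = np.arange(36) % 9

# Train the boosting classifier on the frozen CNN features.
features = extract_features(X_train)
clf = XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1)
clf.fit(features, y_train)

# Inference: extract features for new images, then predict the emotion class.
preds = clf.predict(extract_features(X_train[:4]))
print(preds)
```

The same feature-extraction step can be paired with AdaBoost or Gradient Boosting classifiers from scikit-learn by swapping the final estimator.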

Published

2024-06-19

How to Cite

Ketan Sarvakar, Dr. Kaushikkumar Rana. (2024). Design and Implementation of a Multi-Tier Deep Learning Framework for Robust Facial Emotion Recognition Using CNNs, Hybrid Boosting, and Vision Transformers. Journal of Computational Analysis and Applications (JoCAAA), 33(06), 2414–2419. Retrieved from https://eudoxuspress.com/index.php/pub/article/view/3374

Issue

Vol. 33 No. 06 (2024)

Section

Articles