Hybrid Speech Emotion Recognition System Using Machine Learning and Natural Language Processing

Authors

  • Sachinkumar, Associate Professor, Department of CSE, Jain College of Engineering, Belagavi (Karnataka), INDIA
  • Ryan Dias, Associate Professor, Department of CSE, KLE Technological University, Belagavi (Karnataka), INDIA
  • Mahesh B Neelagar, Assistant Professor, Department of E&CE, VTU Belagavi (Karnataka), INDIA
  • Ashwin Patil R K, Assistant Professor, Department of Computer Science and Engineering, Sir M Visvesvaraya College of Engineering, Raichur, VTU Belagavi (Karnataka), INDIA
  • Vishwanath P, Professor, Department of ECE, H.K.E. Society’s Sir M Visvesvaraya College of Engineering, Raichur (Karnataka), INDIA (Affiliated to VTU, Belagavi)
  • Sangamesh H, Assistant Professor, Department of ECE, H.K.E. Society’s Sir M Visvesvaraya College of Engineering, Raichur (Karnataka), INDIA (Affiliated to VTU, Belagavi)

Keywords:

SER, Natural Language Processing, Machine Learning, RAVDESS, EMODB.

Abstract

Speech Emotion Recognition (SER) is a research area aimed at enabling machines to detect and interpret human emotions, thereby improving human-computer interaction. This study introduces a hybrid SER system that combines machine learning techniques with Natural Language Processing (NLP) to improve the accuracy and robustness of emotion detection. The system integrates acoustic and linguistic features: acoustic features such as Mel-Frequency Cepstral Coefficients (MFCCs), pitch, and energy capture the vocal expression of emotion, while linguistic features derived from the textual content are analyzed using sentiment analysis, semantic embeddings, and syntactic patterns. By fusing these complementary feature sets, the proposed system addresses the limitations of unimodal approaches and gives a more complete analysis of emotional states. Machine learning algorithms such as Support Vector Machines (SVM), Random Forest, and Gradient Boosting are employed for classification, so that the system can handle diverse emotional cues. The hybrid system was validated on benchmark SER datasets and outperformed traditional methods in accuracy, particularly in noisy and cross-lingual scenarios. The results demonstrate the benefit of combining machine learning with NLP for recognizing emotions in speech and point to potential applications in virtual assistants, mental health monitoring, and customer service automation.
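
The feature-level fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names MFCCs, pitch, semantic embeddings, and SVM, Random Forest, and Gradient Boosting classifiers, whereas this example assumes much simpler stand-ins (mean short-time energy and zero-crossing rate for the acoustic side, a toy sentiment lexicon for the linguistic side, and a nearest-centroid rule in place of the named classifiers). All function names, the lexicon, and the frame length are hypothetical.

```python
import math

# Hypothetical toy lexicon; the abstract does not name its sentiment resources.
SENTIMENT_LEXICON = {
    "happy": 1.0, "great": 1.0, "love": 1.0,
    "sad": -1.0, "angry": -1.0, "hate": -1.0,
}


def acoustic_features(samples, frame_len=4):
    """Return [mean short-time energy, zero-crossing rate] for a waveform.

    Simple stand-ins for the MFCC, pitch, and energy features in the abstract.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    energies = [sum(x * x for x in frame) / frame_len for frame in frames]
    mean_energy = sum(energies) / len(energies)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    return [mean_energy, crossings / (len(samples) - 1)]


def linguistic_features(transcript):
    """Return [mean lexicon polarity, token count] for a transcript."""
    tokens = transcript.lower().split()
    scores = [SENTIMENT_LEXICON.get(token, 0.0) for token in tokens]
    polarity = sum(scores) / len(tokens) if tokens else 0.0
    return [polarity, float(len(tokens))]


def fuse(samples, transcript):
    """Feature-level fusion: concatenate acoustic and linguistic vectors."""
    return acoustic_features(samples) + linguistic_features(transcript)


def nearest_centroid(train, query):
    """Predict the label whose mean fused vector is closest to the query.

    A stand-in for the SVM, Random Forest, or Gradient Boosting classifiers.
    `train` maps each label to a list of fused feature vectors.
    """
    best_label, best_dist = None, math.inf
    for label, vectors in train.items():
        centroid = [sum(column) / len(vectors) for column in zip(*vectors)]
        dist = math.dist(centroid, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

A call such as `fuse(waveform, transcript)` yields one vector per utterance, so any classifier that accepts fixed-length feature vectors could replace `nearest_centroid`. In practice the fused features would normally be standardized first, since the token count here is on a much larger scale than the other components.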

Published

2024-11-21

How to Cite

Sachinkumar, Ryan Dias, Mahesh B Neelagar, Ashwin Patil R K, Vishwanath P, & Sangamesh H. (2024). Hybrid Speech Emotion Recognition System Using Machine Learning and Natural Language Processing. Journal of Computational Analysis and Applications (JoCAAA), 33(08), 2475–2484. Retrieved from https://eudoxuspress.com/index.php/pub/article/view/2131

Section

Articles
