Speech Emotion Recognition using MFCC and Hybrid Neural Networks

Youakim Badr, Partha Mukherjee, Sindhu Madhuri Thumati

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

Speech emotion recognition is a challenging task, and feature extraction plays an important role in classifying speech into different emotions. In this paper, we apply traditional feature extraction methods such as Mel-frequency cepstral coefficients (MFCC) to audio files. Instead of using traditional machine learning approaches such as SVM to classify the audio files, we investigate different neural network architectures. Our baseline model, implemented as a convolutional neural network, achieves 60% classification accuracy. We propose a hybrid neural network architecture based on Convolutional and Long Short-Term Memory (ConvLSTM) networks to capture the spatial and sequential information of audio files. Our experimental results show that the ConvLSTM model achieves an accuracy of 59%. We then improved the model with data augmentation techniques and retrained it on the augmented dataset. The classification accuracy reaches 91% for multi-class classification on the RAVDESS dataset, outperforming state-of-the-art multi-class classification models that used similar data.
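The MFCC step named in the abstract can be illustrated with a minimal, standard-library-only sketch of the textbook pipeline: framing, Hamming window, power spectrum, triangular mel filterbank, logarithm, and DCT-II. This is not the authors' implementation. The frame length, hop size, filter count, coefficient count, and all function names below are assumptions chosen to keep the example small, and a real system would normally rely on an audio library such as librosa rather than a naive DFT.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_filters + 2 edge frequencies, mapped to DFT bin indices
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * n_bins
        for k in range(lo, mid):   # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct2(values, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_out)]

def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=12, n_coeffs=6):
    """Return one list of n_coeffs cepstral coefficients per frame.

    All default parameter values are illustrative assumptions, not
    settings taken from the paper.
    """
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Floor the filter energies so the logarithm is always defined
        log_energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                        for row in bank]
        features.append(dct2(log_energies, n_coeffs))
    return features

# Usage: a synthetic 440 Hz tone stands in for a real speech recording
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc(tone, sample_rate)
print(len(coeffs), "frames x", len(coeffs[0]), "coefficients")
```

The result is a frames-by-coefficients matrix. A two-dimensional layout of this kind is what lets a convolutional layer treat the features as a spatial map while a recurrent layer reads the frame axis as a sequence, which matches the spatial and sequential framing in the abstract.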

Original language: English (US)
Title of host publication: IJCCI 2021 - Proceedings of the 13th International Joint Conference on Computational Intelligence
Editors: Thomas Back, Christian Wagner, Jonathan Garibaldi, H. K. Lam, Marie Cottrell, Juan Julian Merelo, Kevin Warwick
Publisher: Science and Technology Publications, Lda
Pages: 366-373
Number of pages: 8
ISBN (Electronic): 9789897585340
State: Published - 2021
Event: 13th International Joint Conference on Computational Intelligence, IJCCI 2021 - Virtual, Online
Duration: Oct 25 2021 → Oct 27 2021

Publication series

Name: ICETE International Conference on E-Business and Telecommunication Networks (International Joint Conference on Computational Intelligence)
Volume: 2021-October
ISSN (Print): 2184-2825

Conference

Conference: 13th International Joint Conference on Computational Intelligence, IJCCI 2021
City: Virtual, Online
Period: 10/25/21 → 10/27/21

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
  • Computer Networks and Communications
  • Computer Science Applications
