Stride Based Convolutional Neural Network for Speech Emotion Recognition

Wani, T. M.; Gunawan, T. S.; Qadri, S. A. A.; Mansor, H.; Arifin, F.; Ahmad, Y. A.

doi:10.1109/ICSIMA50015.2021.9526320

Speech Emotion Recognition (SER) recognizes the emotional features of speech signals regardless of their semantic content. Deep learning techniques have proven superior to conventional techniques for emotion recognition because of advantages such as speed, scalability, and versatility. However, since emotions are subjective, there is no universal agreement on how to evaluate or categorize them. The main objective of this paper is to design a suitable Convolutional Neural Network (CNN) model, the Stride-based Convolutional Neural Network (SCNN), which uses fewer convolutional layers and eliminates the pooling layers to increase computational stability. Eliminating the pooling layers tends to increase the accuracy and decrease the computational time of the SER system. Instead of pooling layers, deep strides are used for the necessary dimension reduction. The SCNN is trained on spectrograms generated from the speech signals of two different databases, Berlin (Emo-DB) and IITKGP-SEHSC. Four emotions (angry, happy, neutral, and sad) are considered in the evaluation, and validation accuracies of 90.67% and 91.33% are achieved for Emo-DB and IITKGP-SEHSC, respectively. This study provides new benchmarks for both datasets, demonstrating the feasibility and relevance of the presented SER technique.
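
The abstract's central idea, replacing pooling layers with strided convolutions for dimension reduction, can be illustrated with a minimal sketch. It is not the authors' SCNN: the 128x128 spectrogram size, the 3x3 kernel, the padding of 1, and the helper names `conv_output_size`, `strided_conv2d`, and `shape_after_layers` are all assumptions chosen for illustration. The sketch only shows that a single stride-2 convolution yields the same spatial reduction as a stride-1 convolution followed by 2x2 pooling.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one axis of a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1


def strided_conv2d(image: Matrix, kernel: Matrix, stride: int) -> Matrix:
    """Unpadded 2-D cross-correlation that moves the kernel `stride` cells per step."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = conv_output_size(len(image), kh, stride)
    out_w = conv_output_size(len(image[0]), kw, stride)
    out: Matrix = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    acc += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out


def shape_after_layers(size: int, layers: List[Tuple[int, int, int]]) -> int:
    """Apply (kernel, stride, padding) layers in order and return the final axis length."""
    for kernel, stride, padding in layers:
        size = conv_output_size(size, kernel, stride, padding)
    return size


# Hypothetical 128x128 spectrogram (size assumed, not taken from the paper).
# Conventional block: 3x3 convolution (stride 1, padding 1), then 2x2 pooling (stride 2).
pooled = shape_after_layers(128, [(3, 1, 1), (2, 2, 0)])
# Stride-based block: one 3x3 convolution with stride 2 and padding 1, no pooling layer.
strided = shape_after_layers(128, [(3, 2, 1)])

# Tiny numeric example: a 4x4 input reduced to 2x2 by a 2x2 kernel with stride 2.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
kernel = [[1.0, 1.0], [1.0, 1.0]]
reduced = strided_conv2d(image, kernel, stride=2)
```

Both `pooled` and `strided` come out as 64 per axis, so the strided layer halves the resolution in one step, with learned weights doing the downsampling instead of a fixed pooling rule.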

Stride Based Convolutional Neural Network for Speech Emotion Recognition / Wani, T. M.; Gunawan, T. S.; Qadri, S. A. A.; Mansor, H.; Arifin, F.; Ahmad, Y. A. - (2021), pp. 41-46. (Paper presented at the 2021 IEEE 7th International Conference on Smart Instrumentation, Measurement and Applications (ICSIMA), held in Bandung, Indonesia) [10.1109/ICSIMA50015.2021.9526320].

Wani T. M. (first author; Writing – Original Draft Preparation); Gunawan T. S.; Qadri S. A. A.; Mansor H.; Arifin F.; Ahmad Y. A.

Full record

Year of publication: 2021
Conference name: 2021 IEEE 7th International Conference on Smart Instrumentation, Measurement and Applications (ICSIMA)
Keywords: Speech Emotion Recognition (SER); Stride-based Convolutional Neural Networks (SCNN); Strides; Spectrograms
Type: 04 Publication in conference proceedings::04b Conference paper in volume
Belongs to type: 04b Conference paper in volume

Files attached to this item

File	Size	Format
Wani_Stride_2021.pdf (archive managers only; Type: Publisher's version, published with the publisher's layout; License: All rights reserved)	980.49 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1714011
