Quaternion anti-transfer learning for speech emotion recognition / Guizzo, E.; Weyde, T.; Tarroni, G.; Comminiello, D. - (2023), pp. 1-5. (Paper presented at the 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023, held in New Paltz, NY, USA) [10.1109/WASPAA58266.2023.10248082].

Quaternion anti-transfer learning for speech emotion recognition

Guizzo E. (first author); Weyde T.; Tarroni G.; Comminiello D. (last author)
2023

Abstract

This study explores the benefits of anti-transfer learning with quaternion neural networks for robust, effective, and efficient speech emotion recognition. Anti-transfer learning selectively promotes task invariance by introducing a deep feature loss at training time. It has been shown to improve the performance of speech emotion recognition models by encouraging emotion predictions to be independent of the specific uttered words and of the characteristics of the speaker's voice. However, the improved accuracy comes at the cost of increased computation time and memory requirements. To reduce the resource demand of anti-transfer, we propose to exploit quaternion-valued processing. We design, implement, and evaluate quaternion anti-transfer learning based on the VGG16 architecture and quaternion embeddings, on multiple datasets and for different speech emotion recognition task setups. The effectiveness of this approach depends on the layer where it is applied, with early layers offering a good compromise between performance gain and resource requirements. Our results show that anti-transfer in the quaternion domain can enhance generalisation while reducing the model's demand for computation and memory.
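
To illustrate the deep feature loss the abstract refers to, the following is a minimal PyTorch-style sketch of an anti-transfer training step: the network under training is penalised for producing features similar to those of a frozen network pretrained on an orthogonal task (e.g. speaker identity or word recognition). The squared-cosine aggregation, the return_features API, and the beta weight are illustrative assumptions, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def anti_transfer_loss(feat_task, feat_ref, eps=1e-8):
        # Deep feature loss: penalise similarity between the activations of
        # the network being trained and those of a frozen network pretrained
        # on an orthogonal task (e.g. speaker ID or word recognition).
        # Squared cosine similarity over flattened feature maps is an
        # illustrative aggregation choice, not necessarily the paper's.
        a = feat_task.flatten(start_dim=1)
        b = feat_ref.flatten(start_dim=1).detach()  # reference stays frozen
        cos = F.cosine_similarity(a, b, dim=1, eps=eps)
        return (cos ** 2).mean()

    def training_step(model, pretrained, x, y, beta=1.0):
        # `return_features=True` is a hypothetical API exposing the chosen
        # layer's activations; the abstract reports early layers as a good
        # compromise between performance gain and resource cost.
        logits, feat = model(x, return_features=True)
        with torch.no_grad():
            _, feat_ref = pretrained(x, return_features=True)
        task_loss = F.cross_entropy(logits, y)
        return task_loss + beta * anti_transfer_loss(feat, feat_ref)

In the quaternion variant, the convolutional layers operate on quaternion-valued feature maps via the Hamilton product, which shares weights across the four components and typically reduces parameter count by roughly a factor of four relative to an equivalent real-valued VGG16; this is what lowers the memory and computation overhead of the anti-transfer term.
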
Year: 2023
Conference: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
Keywords: anti-transfer learning; quaternion neural networks; speech emotion recognition
Type: 04 Publication in conference proceedings::04b Conference paper in a volume
Files attached to this record

File: Guizzo_Quaternion Anti-Transfer_2023.pdf (access restricted to archive administrators)
Note: Article
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 959.46 kB
Format: Adobe PDF
Access: Contact the author

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1693482
Citations
  • PMC: n/a
  • Scopus: 0
  • Web of Science: 0