Research Products Catalogue

Spatial audio methods are attracting growing interest due to the spread of immersive audio experiences and applications, such as virtual and augmented reality. For these purposes, 3D audio signals are often acquired through arrays of Ambisonics microphones, each comprising four capsules that decompose the sound field into spherical harmonics. In this paper, we propose a dual quaternion representation of the spatial sound field acquired through an array of two First Order Ambisonics (FOA) microphones. The audio signals are encapsulated in a dual quaternion that leverages quaternion algebra properties to exploit correlations among them. This augmented representation with six degrees of freedom (6DOF) provides more accurate coverage of the sound field, resulting in more precise sound localization and a more immersive audio experience. We evaluate our approach on a sound event localization and detection (SELD) benchmark. We show that our dual quaternion SELD model with temporal convolution blocks (DualQSELD-TCN) achieves better results than real- and quaternion-valued baselines thanks to our augmented representation of the sound field. Full code is available at: https://github.com/ispamm/DualQSELD-TCN.
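
To make the idea of encapsulating two FOA signals in a dual quaternion concrete, here is a minimal Python sketch. It is not taken from the DualQSELD-TCN repository. The class names, the helper `foa_frame_to_dual_quaternion`, the channel order (W, X, Y, Z) and the choice of which microphone fills the real part and which the dual part are all assumptions made for illustration. The sketch only uses the standard algebra: the Hamilton product for quaternions, and the dual unit eps with eps^2 = 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quaternion:
    """q = w + x*i + y*j + z*k."""
    w: float
    x: float
    y: float
    z: float

    def __add__(self, other):
        return Quaternion(self.w + other.w, self.x + other.x,
                          self.y + other.y, self.z + other.z)

    def __mul__(self, other):
        # Hamilton product (non-commutative).
        a1, b1, c1, d1 = self.w, self.x, self.y, self.z
        a2, b2, c2, d2 = other.w, other.x, other.y, other.z
        return Quaternion(
            a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
        )


@dataclass(frozen=True)
class DualQuaternion:
    """dq = real + eps * dual, where eps**2 == 0."""
    real: Quaternion
    dual: Quaternion

    def __mul__(self, other):
        # (a + eps*b)(c + eps*d) = a*c + eps*(a*d + b*c), because eps^2 = 0.
        return DualQuaternion(
            self.real * other.real,
            self.real * other.dual + self.dual * other.real,
        )


def foa_frame_to_dual_quaternion(mic_a, mic_b):
    """Pack one sample from two FOA microphones into a dual quaternion.

    Hypothetical helper: each argument is assumed to be a 4-tuple of
    channel values in the order (W, X, Y, Z). Assigning mic_a to the
    real part and mic_b to the dual part is an illustrative choice.
    """
    return DualQuaternion(Quaternion(*mic_a), Quaternion(*mic_b))
```

In this layout the eight channels of the two microphones form one algebraic object, so a product with a dual quaternion weight mixes the channels within each microphone (through the Hamilton products) and across the two microphones (through the cross terms in the dual part).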

Dual quaternion ambisonics array for six-degree-of-freedom acoustic representation / Grassucci, E.; Mancini, G.; Brignone, C.; Uncini, A.; Comminiello, D.. - In: PATTERN RECOGNITION LETTERS. - ISSN 0167-8655. - 166:(2023), pp. 24-30. [10.1016/j.patrec.2022.12.006]

Dual quaternion ambisonics array for six-degree-of-freedom acoustic representation

Grassucci E.; Mancini G.; Brignone C.; Uncini A.; Comminiello D.

2023

Year of publication	2023
Keywords	dual quaternion neural networks; dual quaternions; quaternion ambisonics signals; quaternion neural networks
Type	01 Journal publication::01a Journal article
Citation	Dual quaternion ambisonics array for six-degree-of-freedom acoustic representation / Grassucci, E.; Mancini, G.; Brignone, C.; Uncini, A.; Comminiello, D.. - In: PATTERN RECOGNITION LETTERS. - ISSN 0167-8655. - 166:(2023), pp. 24-30. [10.1016/j.patrec.2022.12.006]
Belongs to type:	01a Journal article

Files attached to this product

File	Access	Type	License	Size	Format
Grassucci_preprint_Dual-quaternion_2023.pdf.pdf	Open Access from 02/03/2025	Pre-print document (manuscript submitted to the publisher, prior to peer review)	Creative Commons	647.85 kB	Adobe PDF
Grassucci_Dual-quaternion_2023.pdf	Archive managers only	Publisher's version (published version with the publisher's layout)	All rights reserved	850.2 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1669169
