
L3DAS21 challenge: machine learning for 3D audio signal processing

Guizzo Eric; Gramaccioni Riccardo Fosco; Marinoni Christian; Uncini Aurelio; Comminiello Danilo
2021

Abstract

The L3DAS21 Challenge (www.l3das.com/mlsp2021) is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Usually, machine learning approaches to 3D audio tasks are based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based on multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, it is the first time that a dual-mic Ambisonics configuration is used for these tasks. We provide baseline models and results for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and SELDnet for SELD.
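
As a rough illustration of the multiple-perspective setup described in the abstract, the sketch below loads two first-order Ambisonics (B-format, 4-channel) recordings and stacks them into a single 8-channel array, as a model such as FaSNet or SELDnet might consume them. This is not the challenge's own Python API: the file names mic_A.wav and mic_B.wav are hypothetical placeholders, and the snippet relies only on generic libraries (soundfile, NumPy).

    # Minimal sketch, assuming two co-recorded first-order Ambisonics files.
    import numpy as np
    import soundfile as sf

    sig_a, sr = sf.read("mic_A.wav")  # B-format, shape (num_frames, 4): W, X, Y, Z
    sig_b, _ = sf.read("mic_B.wav")   # second Ambisonics microphone, same shape

    # Combine the two perspectives into one (num_frames, 8) multichannel input.
    multi_perspective = np.concatenate([sig_a, sig_b], axis=1)
    print(multi_perspective.shape, sr)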
2021
31st IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2021
3D audio; ambisonics; data challenge; sound source classification; sound source localization; speech enhancement
04 Publication in conference proceedings::04b Conference paper published in a volume
L3DAS21 challenge: machine learning for 3D audio signal processing / Guizzo, Eric; Gramaccioni, Riccardo Fosco; Jamili, Saeid; Marinoni, Christian; Massaro, Edoardo; Medaglia, Claudia; Nachira, Giuseppe; Nucciarelli, Leonardo; Paglialunga, Ludovica; Pennese, Marco; Pepe, Sveva; Rocchi, Enrico; Uncini, Aurelio; Comminiello, Danilo. - 2021:(2021), pp. 1-6. (Paper presented at the 31st IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2021, held in Gold Coast, Australia) [10.1109/MLSP52302.2021.9596248].
Files attached to this record
File: Guizzo_L3DAS21-challenge_ 2021.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 214.92 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this record: https://hdl.handle.net/11573/1606332
Citations
  • Scopus: 15
  • Web of Science (ISI): 9