CoVal-SGAN: A Complex-Valued Spectral GAN architecture for the effective audio data augmentation in construction sites / Scarpiniti, M.; Mauri, C.; Comminiello, D.; Uncini, A.; Lee, Y.-C. - 2022-July (2022), pp. 1-8. (Paper presented at the 2022 International Joint Conference on Neural Networks (IJCNN 2022), held in Padua, Italy) [10.1109/IJCNN55064.2022.9891915].

CoVal-SGAN: A Complex-Valued Spectral GAN architecture for the effective audio data augmentation in construction sites

Scarpiniti M.; Mauri C.; Comminiello D.; Uncini A.; Lee Y.-C.
2022

Abstract

Generative audio data augmentation for construction sites is a challenging research area because the work sounds of the machines and equipment involved are highly dissimilar. It is nevertheless necessary, since audio data for critical work classes are often scarce. Motivated by these considerations, in this paper we propose CoVal-SGAN, a complex-valued GAN architecture that operates on the audio spectrogram, for effective audio data augmentation. Specifically, CoVal-SGAN exploits both magnitude and phase information to improve the quality of the artificially generated audio signals and to increase the overall performance of the underlying classifier. Numerical results obtained on data recorded at real-world construction sites, together with comparisons against available state-of-the-art approaches, show the effectiveness of the proposed idea through improved accuracy.
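
As a rough illustration of why a spectrogram-based generator might want phase as well as magnitude, the sketch below builds a small complex spectrogram and compares reconstruction with and without phase. It is not the CoVal-SGAN architecture or the authors' code: the naive DFT, the helper names (`dft`, `idft`, `complex_spectrogram`), the frame length, the hop size, and the toy sine signal are all assumptions chosen for brevity.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)).real / n
        for t in range(n)
    ]


def complex_spectrogram(signal, frame_len=8, hop=4):
    """Split the signal into overlapping frames and transform each one."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [dft(signal[s:s + frame_len]) for s in starts]


# Toy signal (hypothetical): one cycle per 8 samples, shifted by 0.7 rad.
signal = [math.sin(2 * math.pi * t / 8 + 0.7) for t in range(16)]
spec = complex_spectrogram(signal)

# Each complex bin carries a magnitude and a phase.
magnitude = [[abs(c) for c in frame] for frame in spec]
phase = [[cmath.phase(c) for c in frame] for frame in spec]

# Magnitude and phase together recover the first frame ...
first = [cmath.rect(m, p) for m, p in zip(magnitude[0], phase[0])]
with_phase = idft(first)

# ... whereas magnitude alone (phase set to zero) gives a different waveform.
without_phase = idft([complex(m, 0.0) for m in magnitude[0]])

err_with = max(abs(a - b) for a, b in zip(with_phase, signal[:8]))
err_without = max(abs(a - b) for a, b in zip(without_phase, signal[:8]))
print(f"max error with phase: {err_with:.2e}")
print(f"max error without phase: {err_without:.2e}")
```

In this toy case the frame is recovered to within floating-point error when phase is kept, while discarding phase produces a visibly different waveform. That is one plausible reading of why a generator that models the full complex spectrogram could yield higher-quality audio.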
2022 International Joint Conference on Neural Networks (IJCNN 2022)
audio data augmentation; construction sites; complex-valued architectures; Generative Adversarial Networks; GAN
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item

Scarpiniti_CoVal-SGAN_2022.pdf
Archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 943.62 kB
Format: Adobe PDF

Scarpiniti_post-print_CoVal-SGAN_2022.pdf.pdf
Open Access from 02/10/2024
Note: post-print
Type: Post-print document (version following peer review and accepted for publication)
License: Creative Commons
Size: 350.07 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1658246