CoVal-SGAN: A Complex-Valued Spectral GAN architecture for the effective audio data augmentation in construction sites

Scarpiniti, M.; Mauri, C.; Comminiello, D.; Uncini, A.; Lee, Y. -C.

doi:10.1109/IJCNN55064.2022.9891915

Generative audio data augmentation for construction sites is a challenging research area because the work sounds of the machines and equipment involved are highly dissimilar. It is nevertheless necessary, since audio data for critical work classes are often scarce. Motivated by these considerations, in this paper we propose CoVal-SGAN, a complex-valued GAN architecture that works on the audio spectrogram, for effective augmentation of audio data. Specifically, the proposed CoVal-SGAN exploits both magnitude and phase information to improve the quality of the artificially generated audio signals and to increase the overall performance of the underlying classifier. Numerical results obtained on data recorded in real-world construction sites, together with comparisons against available state-of-the-art approaches, show the effectiveness of the proposed idea through improved accuracy.
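To make the distinction between magnitude and phase concrete, the following is a minimal NumPy sketch of a complex-valued spectrogram. It is not the authors' implementation and says nothing about the CoVal-SGAN network itself: the `stft` helper, the frame length, the hop size, and the 440 Hz test tone are all assumptions chosen for illustration. It only shows that magnitude and phase together carry the full complex spectrogram, while magnitude alone does not.

```python
import numpy as np

def stft(signal, frame_len=256, hop=128):
    """Complex short-time Fourier transform with a Hann window (illustrative helper)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One row per frame, one column per non-negative frequency bin.
    return np.fft.rfft(frames, axis=1)

# Hypothetical test signal: a 440 Hz tone sampled at 8 kHz (not data from the paper).
sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t)

spec = stft(x)              # complex-valued spectrogram
magnitude = np.abs(spec)    # the part a magnitude-only spectral model works with
phase = np.angle(spec)      # the part a magnitude-only model discards

# Magnitude and phase together rebuild the complex spectrogram...
rebuilt = magnitude * np.exp(1j * phase)
# ...whereas magnitude with the phase set to zero does not.
zero_phase = magnitude.astype(complex)

err_full = np.max(np.abs(rebuilt - spec))
err_mag_only = np.max(np.abs(zero_phase - spec))
print(f"max error with magnitude and phase: {err_full:.2e}")
print(f"max error with magnitude only:      {err_mag_only:.2e}")
```

The first error is at floating-point round-off level, while the second is of the order of the spectrogram values themselves, which is one way to see why a generator that models the complex spectrogram has more information available than one that models magnitude only.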

CoVal-SGAN: A Complex-Valued Spectral GAN architecture for the effective audio data augmentation in construction sites / Scarpiniti, M.; Mauri, C.; Comminiello, D.; Uncini, A.; Lee, Y. -C. - 2022-July:(2022), pp. 1-8. (Paper presented at the 2022 International Joint Conference on Neural Networks (IJCNN 2022), held in Padua, Italy) [10.1109/IJCNN55064.2022.9891915].

Publication year: 2022
Conference name: 2022 International Joint Conference on Neural Networks (IJCNN 2022)
Keywords: audio data augmentation; construction sites; complex-valued architectures; Generative Adversarial Networks; GAN
Type: 04 Publication in conference proceedings::04b Conference paper in volume
Belongs to type: 04b Conference paper in volume

Files attached to this item

File	Access	Type	License	Size	Format
Scarpiniti_CoVal-SGAN_2022.pdf	Archive managers only	Publisher's version (published version with the publisher's layout)	All rights reserved	943.62 kB	Adobe PDF
Scarpiniti_post-print_CoVal-SGAN_2022.pdf.pdf	Open Access since 02/10/2024 (note: post-print)	Post-print document (version after peer review and accepted for publication)	Creative Commons	350.07 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1658246
