A U-Net Based Architecture for Automatic Music Transcription

Scarpiniti, Michele; Sigismondi, Edoardo; Comminiello, Danilo; Uncini, Aurelio

2023

Abstract

Automatic Music Transcription (AMT) is a complex problem that has attracted many researchers. Recently, thanks to powerful Deep Learning techniques, many effective solutions have been proposed. However, there is still room for improvement. To this end, in this paper we propose an architecture based on two U-Net models exploiting Convolutional Neural Networks (CNNs) and a Bidirectional Long Short-Term Memory (BiLSTM) unit, aiming at improving wave-to-MIDI transcription performance. The two U-Nets act as onset and offset detectors, respectively, and their outputs are used jointly with the input mel spectrogram by a third model to find all the active notes in each time frame. Numerical results obtained on the well-known MAPS dataset show the effectiveness of the proposed idea and its advantages over similar state-of-the-art approaches.
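The data flow described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the three models are hypothetical stubs that return all-zero probabilities, the sizes `N_MELS` and `N_PITCHES` are placeholder values, and combining the three inputs by per-frame concatenation is an assumption (the abstract says only that they are "jointly used").

```python
from typing import Callable, List

Frame = List[float]      # features of one time frame
Frames = List[Frame]     # frames ordered in time

N_MELS = 64              # hypothetical number of mel bands
N_PITCHES = 88           # hypothetical number of output pitches (piano keys)


def stub_detector(mel: Frames) -> Frames:
    """Hypothetical stand-in for a trained onset or offset U-Net."""
    return [[0.0] * N_PITCHES for _ in mel]


def stub_frame_model(joint: Frames) -> Frames:
    """Hypothetical stand-in for the third model (active notes per frame)."""
    return [[0.0] * N_PITCHES for _ in joint]


def transcribe_frames(
    mel: Frames,
    onset_model: Callable[[Frames], Frames],
    offset_model: Callable[[Frames], Frames],
    frame_model: Callable[[Frames], Frames],
) -> Frames:
    """Run both detectors, then feed their outputs plus the mel frames
    to the third model."""
    onsets = onset_model(mel)
    offsets = offset_model(mel)
    # Assumption: the three inputs are joined by concatenating them per frame.
    joint = [m + on + off for m, on, off in zip(mel, onsets, offsets)]
    return frame_model(joint)


if __name__ == "__main__":
    mel = [[0.0] * N_MELS for _ in range(4)]   # 4 dummy time frames
    active = transcribe_frames(mel, stub_detector, stub_detector, stub_frame_model)
    print(len(active), len(active[0]))
```

In a real system each stub would be replaced by a trained network, and the per-frame outputs would still have to be converted into MIDI note events.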

Full record

Publication year: 2023

Conference name: 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP 2023)

Keywords: Automatic Music Transcription (AMT); wave to MIDI; deep learning; U-Net; Convolutional Neural Network (CNN)

Type: 04 Publication in conference proceedings::04b Conference paper in volume

Citation: A U-Net Based Architecture for Automatic Music Transcription / Scarpiniti, Michele; Sigismondi, Edoardo; Comminiello, Danilo; Uncini, Aurelio. - (2023), pp. 1-6. (Paper presented at the 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP 2023), held in Rome, Italy) [10.1109/MLSP55844.2023.10285985].

Files attached to this record

File	Access	Type	License	Size	Format
Scarpiniti_A-U-Net_2023.pdf	Archive managers only	Publisher's version (published version with the publisher's layout)	All rights reserved	397.8 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1692643
