A U-Net Based Architecture for Automatic Music Transcription / Scarpiniti, Michele; Sigismondi, Edoardo; Comminiello, Danilo; Uncini, Aurelio. - (2023), pp. 1-6. (Paper presented at the 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP 2023), held in Rome, Italy) [10.1109/MLSP55844.2023.10285985].
A U-Net Based Architecture for Automatic Music Transcription
Scarpiniti, Michele; Sigismondi, Edoardo; Comminiello, Danilo; Uncini, Aurelio
2023
Abstract
Automatic Music Transcription (AMT) is a challenging task that has attracted considerable research attention. Recently, thanks to powerful Deep Learning techniques, many effective solutions have been proposed. However, there is still room for improvement. To this end, we propose an architecture based on two U-Net models that exploit Convolutional Neural Networks (CNNs) and a Bidirectional Long Short-Term Memory (BiLSTM) unit, with the aim of improving wave-to-MIDI transcription performance. The two U-Nets act as onset and offset detectors, respectively, and their outputs are fed, together with the input mel spectrogram, into a third model that finds all the active notes in each time frame. Numerical results obtained on the well-known MAPS dataset show the effectiveness of the proposed idea and its advantages over similar state-of-the-art approaches.

File | Size | Format | |
---|---|---|---|
Scarpiniti_A-U-Net_2023.pdf (archive managers only). Type: Publisher's version (published version with the publisher's layout). License: All rights reserved | 397.8 kB | Adobe PDF | Contact the author |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.