Instrument learning and sparse NMD for automatic polyphonic music transcription / Rizzi, Antonello; Antonelli, Mario; Luzi, Massimiliano. - In: IEEE TRANSACTIONS ON MULTIMEDIA. - ISSN 1520-9210. - PRINT. - 19:7(2017), pp. 1405-1415. [10.1109/TMM.2017.2674603]

Instrument learning and sparse NMD for automatic polyphonic music transcription

RIZZI, Antonello; LUZI, Massimiliano
2017

Abstract

In this paper, an Automatic Music Transcription (AMT) algorithm based on a supervised Non-negative Matrix Decomposition (NMD) is discussed. In particular, a novel approach for enhancing the sparsity of the solution is proposed. It consists of a two-step process in which the NMD is solved by combining an ℓ2 regularization with threshold filtering. In the first step, the NMD is performed with the ℓ2 regularization in order to get an overall selection of the notes most likely to appear in the monotimbral musical excerpt. In the second step, a threshold filtering followed by another ℓ2-regularized NMD is performed repeatedly in order to progressively reduce the dictionary matrix and to refine the note transcription. Furthermore, a user-oriented instrument learning procedure has been conceived and proposed. The proposed AMT system has been tested on the dataset collected by the LabROSA laboratories, considering the transcription of three different pianos. Moreover, it has been validated through a comparison with a regularized NMD and with three open-source AMT software packages. The results prove the effectiveness of the proposed two-step process in enhancing the sparsity of the solution and in improving the transcription accuracy. Moreover, the proposed system shows promising performance in both multi-F0 and note tracking tasks, obtaining better transcription accuracy than the competing algorithms in most tests.
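As a rough illustration of the two-step idea described in the abstract, the following Python sketch may help. It is not the authors' implementation: the abstract does not state the cost function, the solver, or the threshold rule, so the squared-error cost, the multiplicative update, and the names `solve_activations`, `two_step_transcription`, `lam`, `rel_threshold`, and `max_rounds` are all assumptions introduced here. The keywords mention non-monotone optimization, which suggests the paper relies on a different solver than the one swapped in below.

```python
import numpy as np


def solve_activations(V, W, lam=0.1, n_iter=200, eps=1e-12):
    """Find H >= 0 reducing ||V - W H||_F^2 + lam * ||H||_F^2 with W fixed.

    Assumed solver: plain multiplicative updates (not taken from the paper).
    V: (bins, frames) magnitude spectrogram, W: (bins, notes) dictionary.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # strictly positive start
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        # Ratio of the negative to the positive part of the gradient.
        H *= WtV / (WtW @ H + lam * H + eps)
    return H


def two_step_transcription(V, W, lam=0.1, rel_threshold=0.1, max_rounds=5):
    """Step 1: regularized NMD on the full dictionary.
    Step 2: repeatedly drop weakly activated notes and solve again.

    The relative-threshold rule is a hypothetical choice for illustration.
    Returns the indices of the surviving dictionary columns and their H.
    """
    active = np.arange(W.shape[1])
    H = solve_activations(V, W, lam)  # step 1
    for _ in range(max_rounds):       # step 2
        peak = H.max(axis=1)          # strongest activation of each note
        keep = peak >= rel_threshold * peak.max()
        if keep.all():
            break                     # nothing left to prune
        active = active[keep]
        H = solve_activations(V, W[:, active], lam)
    return active, H
```

Calling `two_step_transcription(V, W)` with a spectrogram `V` and a pre-learned note dictionary `W` returns the indices of the notes that survive pruning together with their activations over time. In this sketch the dictionary shrinks only by removing columns, so each later solve works on a smaller problem.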
automatic music transcription; spectrogram factorization; non-negative matrix decomposition; sparse coding; non-monotone optimization
01 Journal publication::01a Journal article
Files attached to this item
File: Rizzi_Instrument_2017.pdf (authorized users only)
Note: Instrument Learning and Sparse NMD for Automatic Polyphonic Music Transcription
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 321.68 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/958235
Citations
  • PMC: n/a
  • Scopus: 10
  • ISI: 10