Cross-Attentive CNNs for Joint Spectral and Pitch Feature Learning in Predominant Instrument Recognition from Polyphonic Music / Reghunath, L. C.; Rajan, R.; Napoli, C.; Randieri, C. - In: TECHNOLOGIES. - ISSN 2227-7080. - 14:1(2026). [10.3390/technologies14010003]
Cross-Attentive CNNs for Joint Spectral and Pitch Feature Learning in Predominant Instrument Recognition from Polyphonic Music
Napoli C.; Randieri C.
2026
Abstract
Identifying instruments in polyphonic audio is challenging because spectra overlap and because timbre and playing style vary. This task is central to music information retrieval, with applications in transcription, recommendation, and indexing. We propose a dual-branch Convolutional Neural Network (CNN) that processes Mel-spectrograms and binary pitch masks, fused through a cross-attention mechanism to emphasize pitch-salient regions. On the IRMAS dataset, the model performs competitively with state-of-the-art methods, reaching a micro F1 of 0.64 and a macro F1 of 0.57 with only 0.878M parameters. Ablation studies and t-SNE analyses further highlight the benefits of cross-modal attention for robust predominant instrument recognition.
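The abstract reports both a micro and a macro F1 score. As a reading aid, the following minimal sketch shows how the two averages differ for multi-label predictions. It is not the authors' evaluation code: the function names (`f1`, `micro_macro_f1`), the label names, and the toy clips are all assumptions made for illustration. Micro F1 pools counts over all classes, while macro F1 averages per-class scores, so classes that are recognized poorly lower the macro score more.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred, labels):
    """Return (micro_f1, macro_f1) for multi-label data given as sets of labels."""
    # Per-class [true positives, false positives, false negatives].
    counts = {c: [0, 0, 0] for c in labels}
    for truth, pred in zip(y_true, y_pred):
        for c in labels:
            if c in pred and c in truth:
                counts[c][0] += 1
            elif c in pred:
                counts[c][1] += 1
            elif c in truth:
                counts[c][2] += 1

    # Macro: average the per-class F1 scores, each class weighted equally.
    macro = sum(f1(*counts[c]) for c in labels) / len(labels)

    # Micro: pool the counts over all classes, then compute a single F1.
    tp = sum(v[0] for v in counts.values())
    fp = sum(v[1] for v in counts.values())
    fn = sum(v[2] for v in counts.values())
    micro = f1(tp, fp, fn)
    return micro, macro


# Toy data (hypothetical): one frequent class predicted well, two rare ones missed.
labels = ["guitar", "piano", "voice"]
y_true = [{"guitar"}, {"guitar"}, {"guitar", "voice"}, {"piano"}]
y_pred = [{"guitar"}, {"guitar"}, {"guitar"}, {"voice"}]

micro, macro = micro_macro_f1(y_true, y_pred, labels)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy case the micro score exceeds the macro score because the well-predicted class dominates the pooled counts. A gap in the same direction, as in the reported 0.64 versus 0.57, is consistent with uneven per-class performance, although the abstract itself does not break the scores down by class.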


