PHYDI. Initializing parameterized hypercomplex neural networks as identity functions

Mancanelli M.; Grassucci E. (co-first author); Uncini A.; Comminiello D. (last author)
2023

Abstract

Neural models based on hypercomplex algebra systems are growing and proliferating across a plethora of applications, ranging from computer vision to natural language processing. Hand in hand with their adoption, parameterized hypercomplex neural networks (PHNNs) are growing in size, and no techniques have been adopted so far to control their convergence at large scale. In this paper, we study PHNN convergence and propose parameterized hypercomplex identity initialization (PHYDI), a method to improve convergence at different scales, leading to more robust performance when the number of layers scales up, while also reaching the same performance with fewer iterations. We show the effectiveness of this approach on different benchmarks and with common ResNet- and Transformer-based PHNNs. The code is available at https://github.com/ispamm/PHYDI.
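To make the idea behind the abstract concrete, the following is a minimal, hypothetical PyTorch sketch of initializing a parameterized hypercomplex (PHM) residual block as an identity function. The names PHMLinear and init_block_as_identity and the zero-initialization recipe are illustrative assumptions in the spirit of identity initializations such as Fixup/ReZero, not the authors' exact PHYDI scheme; see the paper and the linked repository for the actual method.

```python
import torch
import torch.nn as nn


class PHMLinear(nn.Module):
    """Sketch of a parameterized hypercomplex multiplication (PHM) layer:
    the weight is W = sum_i kron(A_i, F_i), where the n x n matrices A_i
    parameterize the algebra and the F_i are learnable weight blocks."""

    def __init__(self, n: int, in_features: int, out_features: int):
        super().__init__()
        assert in_features % n == 0 and out_features % n == 0
        self.n = n
        self.A = nn.Parameter(torch.randn(n, n, n))  # algebra components A_i
        self.F = nn.Parameter(
            torch.randn(n, out_features // n, in_features // n)  # blocks F_i
        )

    def weight(self) -> torch.Tensor:
        # W = sum_i A_i ⊗ F_i, giving shape (out_features, in_features)
        return torch.stack(
            [torch.kron(self.A[i], self.F[i]) for i in range(self.n)]
        ).sum(dim=0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.weight().t()


def init_block_as_identity(block: nn.Module) -> None:
    """Zero the F_i of the last PHM layer in a residual block so that,
    at initialization, f(x) = 0 and hence y = x + f(x) = x: the block
    starts out as the identity function (hypothetical recipe)."""
    phm_layers = [m for m in block.modules() if isinstance(m, PHMLinear)]
    nn.init.zeros_(phm_layers[-1].F)  # sum_i A_i ⊗ 0 = 0


# Usage sketch: a two-layer PHM residual branch initialized as identity.
block = nn.Sequential(PHMLinear(4, 64, 64), nn.ReLU(), PHMLinear(4, 64, 64))
init_block_as_identity(block)
x = torch.randn(8, 64)
assert torch.allclose(x + block(x), x)  # residual block acts as identity
```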
33rd IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2023
hypercomplex algebra; hypercomplex neural networks; identity initialization; neural networks convergence; residual connections
04 Conference proceedings publication::04b Conference paper in volume
PHYDI. Initializing parameterized hypercomplex neural networks as identity functions / Mancanelli, M.; Grassucci, E.; Uncini, A.; Comminiello, D. - (2023), pp. 1-6. (Paper presented at the 33rd IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2023, held in Rome, Italy) [10.1109/MLSP55844.2023.10285926].
Files attached to this product

File: Mancanelli_PHYDI_ 2023.pdf
Access: open access
Note: Article
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 295.9 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1693477
Citations
  • PMC: N/A
  • Scopus: 0
  • Web of Science (ISI): N/A