Bucarelli, Maria Sofia; D'Inverno, Giuseppe Alessio; Bianchini, Monica; Scarselli, Franco; Silvestri, Fabrizio. A topological description of loss surfaces based on Betti Numbers. Neural Networks, 178 (2024). ISSN 0893-6080. https://doi.org/10.1016/j.neunet.2024.106465
A topological description of loss surfaces based on Betti Numbers
Bucarelli, Maria Sofia; Bianchini, Monica; Silvestri, Fabrizio
2024
Abstract
In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with gradient-descent-based methods. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to this field by providing a topological measure for evaluating loss complexity in the case of multilayer neural networks. We compare deep and shallow architectures with common sigmoidal activation functions by deriving upper and lower bounds on the complexity of their respective loss functions, and we reveal how that complexity is influenced by the number of hidden units, the training samples, and the activation function used. Additionally, we find that certain variations in the loss function or model architecture, such as adding an ℓ2 regularization term or implementing skip connections in a feedforward network, do not affect loss topology in specific cases.
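The topological measure the abstract refers to is based on Betti numbers of sublevel sets of the loss. As a purely illustrative sketch (not the paper's method), the zeroth Betti number β0 — the number of connected components of the sublevel set {w : L(w) ≤ c} — can be estimated on a grid discretization of a toy two-parameter loss. The loss function, grid resolution, and thresholds below are hypothetical choices made for this example.

```python
# Hypothetical illustration: count connected components (beta_0) of sublevel
# sets of a toy two-parameter "loss" surface on a discretized grid.
import numpy as np
from scipy import ndimage

# Toy loss with several local minima in two parameters (w1, w2).
w1, w2 = np.meshgrid(np.linspace(-3, 3, 400), np.linspace(-3, 3, 400))
loss = np.sin(3 * w1) * np.sin(3 * w2) + 0.1 * (w1**2 + w2**2)

def betti0(loss_grid, c):
    """Number of connected components of the sublevel set {loss <= c}."""
    mask = loss_grid <= c
    _, num_components = ndimage.label(mask)  # 4-connectivity labeling
    return num_components

# As the threshold c grows, basins appear and then merge: beta_0 first
# increases with the number of separate minima, then drops back to 1.
for c in (-0.5, 0.0, 1.0):
    print(f"beta_0 at c={c}: {betti0(loss, c)}")
```

A genuinely topological treatment would track how these components (and higher-order holes) persist as c varies, which is what makes Betti numbers a natural complexity measure for loss surfaces.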
File | Description | Size | Format
---|---|---|---
Bucarelli_Topological_2024.pdf | Open access; publisher's version (published layout); Creative Commons license. Note: https://doi.org/10.1016/j.neunet.2024.106465 | 622.57 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.