Coppola, Corrado; Papa, Lorenzo; Boresta, Marco; Amerini, Irene; Palagi, Laura (2024). Tuning parameters of deep neural network training algorithms pays off: a computational study. TOP. ISSN 1134-5764. DOI: 10.1007/s11750-024-00683-x
Tuning parameters of deep neural network training algorithms pays off: a computational study
Coppola, Corrado; Papa, Lorenzo; Boresta, Marco; Amerini, Irene; Palagi, Laura
2024
Abstract
This paper investigates the impact of optimization algorithms on the training of deep neural networks, with a focus on the interaction between the optimizer and generalization performance. In particular, we analyze the behavior of state-of-the-art optimization algorithms in relation to their hyperparameter settings, assessing their robustness to the choice of starting point, which can lead them to different local solutions. We conduct extensive computational experiments using nine open-source optimization algorithms to train deep Convolutional Neural Network architectures on an image multi-class classification task. Specifically, we consider several architectures, varying the number of layers and neurons per layer, to evaluate the impact of different width and depth structures on computational optimization performance. We show that the optimizers often return different local solutions and highlight the strong correlation between the quality of the solution found and the generalization capability of the trained network. We also discuss the role of hyperparameter tuning and show how a tuned hyperparameter setting can be re-used for the same task on different problems, achieving better efficiency and generalization performance than a default setting.
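The core experimental idea in the abstract — running the same training problem under several optimizers and hyperparameter settings, then comparing the quality of the solutions found — can be sketched on a toy problem. The snippet below is only an illustration, not the paper's experimental code (the paper trains deep CNNs on image classification with nine open-source optimizers): it minimizes an ill-conditioned quadratic with plain gradient descent and an Adam-style update, under an assumed "default" and a hypothetical "tuned" learning rate, using only NumPy.

```python
import numpy as np

def loss_and_grad(w, H):
    # Ill-conditioned quadratic 0.5 * w^T H w: a stand-in for a training loss.
    return 0.5 * w @ H @ w, H @ w

def run_sgd(H, w0, lr, steps=500):
    # Plain gradient descent; returns the final loss value.
    w = w0.copy()
    for _ in range(steps):
        _, g = loss_and_grad(w, H)
        w -= lr * g
    return loss_and_grad(w, H)[0]

def run_adam(H, w0, lr, steps=500, b1=0.9, b2=0.999, eps=1e-8):
    # Adam-style update with bias correction; returns the final loss value.
    w = w0.copy()
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        _, g = loss_and_grad(w, H)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return loss_and_grad(w, H)[0]

rng = np.random.default_rng(0)
H = np.diag([100.0, 1.0, 0.01])  # condition number 1e4
w0 = rng.standard_normal(3)      # a random starting point

# Same task, same starting point: compare optimizers and learning rates.
for name, runner in [("SGD", run_sgd), ("Adam", run_adam)]:
    default = runner(H, w0, lr=1e-3)  # assumed "default" learning rate
    tuned = runner(H, w0, lr=1e-2)    # hypothetical "tuned" learning rate
    print(f"{name}: default lr -> {default:.3e}, tuned lr -> {tuned:.3e}")
```

On this toy surface the two optimizers, and the two learning rates, end at solutions of different quality from the same starting point, which mirrors (in miniature) the abstract's observation that the final local solution depends on both the optimizer and its hyperparameter setting.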
File: Coppola_Tuning_2024.pdf (open access)
Note: DOI https://doi.org/10.1007/s11750-024-00683-x
Type: publisher's version (published with the publisher's layout)
License: Creative Commons
Size: 2.7 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.