
Phase transitions in the mini-batch size for sparse and dense two-layer neural networks

Marino, Raffaele; Ricci-Tersenghi, Federico
2024

Abstract

The use of mini-batches of data in training artificial neural networks is nowadays very common. Despite its broad usage, theories explaining quantitatively how large or small the optimal mini-batch size should be are missing. This work presents a systematic attempt at understanding the role of the mini-batch size in training two-layer neural networks. Working in the teacher-student scenario, with a sparse teacher, and focusing on tasks of different complexity, we quantify the effects of changing the mini-batch size m. We find that often the generalization performance of the student strongly depends on m and may undergo sharp phase transitions at a critical value m_c, such that for m < m_c the training process fails, while for m > m_c the student learns perfectly or generalizes the teacher very well. Phase transitions are induced by collective phenomena first discovered in statistical mechanics and later observed in many fields of science. Observing a phase transition by varying the mini-batch size across different architectures raises several questions about the role of this hyperparameter in the neural network learning process.
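To make the teacher-student setup concrete, the following is a minimal, illustrative Python sketch: a sparse two-layer teacher labels Gaussian inputs, a dense student of the same shape is trained with online mini-batch SGD, and the mini-batch size m is swept while the generalization error is measured. The soft-committee-machine architecture with tanh units, the quadratic loss, the sparsity level, and all sizes here are assumptions for illustration only; the paper's exact tasks, architectures, and training protocol are specified in the article itself.

```python
# Illustrative teacher-student experiment (not the paper's exact protocol):
# a sparse two-layer teacher generates labels, a dense student is trained
# with online mini-batch SGD, and the mini-batch size m is swept.
import numpy as np

rng = np.random.default_rng(0)

N, K = 100, 3      # input dimension and hidden units (assumed sizes)
sparsity = 0.1     # fraction of non-zero teacher weights (assumed)

# Sparse teacher: first-layer weights with most entries set to zero.
W_teacher = rng.standard_normal((K, N))
W_teacher *= rng.random((K, N)) < sparsity

def forward(W, X):
    """Two-layer soft committee machine: sum of tanh hidden units."""
    return np.tanh(X @ W.T).sum(axis=1)

def train_student(m, steps=10_000, lr=0.01):
    """Online mini-batch SGD on fresh data; returns the final test MSE."""
    W = rng.standard_normal((K, N)) * 0.1   # dense student initialization
    for _ in range(steps):
        X = rng.standard_normal((m, N))     # fresh mini-batch of size m
        y = forward(W_teacher, X)           # teacher labels
        H = np.tanh(X @ W.T)                # student hidden activations, (m, K)
        err = H.sum(axis=1) - y             # residual per example
        # Gradient of 0.5 * mean(err^2) w.r.t. W, averaged over the batch.
        grad = ((err[:, None] * (1 - H**2)).T @ X) / m
        W -= lr * grad
    # Generalization error on an independent test set.
    X_test = rng.standard_normal((10_000, N))
    return np.mean((forward(W, X_test) - forward(W_teacher, X_test))**2)

for m in (1, 2, 4, 8, 16, 32, 64):
    print(f"m = {m:3d}  test MSE = {train_student(m):.4f}")
```

In a sweep like this, a sharp drop in the test error when m crosses some value would be the signature of the transition at m_c described in the abstract; the actual critical values and the tasks on which they appear are established in the paper.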
phase transitions; mini-batch size; neural networks; statistical mechanics
01 Journal publication::01a Journal article
Phase transitions in the mini-batch size for sparse and dense two-layer neural networks / Marino, Raffaele; Ricci-Tersenghi, Federico. - In: MACHINE LEARNING: SCIENCE AND TECHNOLOGY. - ISSN 2632-2153. - 5:1(2024), pp. 1-15. [10.1088/2632-2153/ad1de6]
Files attached to this record
File: Marino_Phase-transitions_2024.pdf
Access: open access
Note: Journal article
Type: Publisher's version (published with the publisher's layout)
License: Creative Commons
Size: 1.39 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1706215