Bianchi, Filippo Maria; Maiorino, Enrico; Kampffmeyer, Michael C.; Rizzi, Antonello; Jenssen, Robert. Properties and training in recurrent neural networks. SpringerBriefs in Computer Science, Springer, 2017, pp. 9-21. DOI: 10.1007/978-3-319-70338-1_2.
Properties and training in recurrent neural networks
Bianchi, Filippo Maria; Maiorino, Enrico; Kampffmeyer, Michael C.; Rizzi, Antonello; Jenssen, Robert
2017
Abstract
In this chapter, we describe the basic concepts behind the functioning of recurrent neural networks and explain the general properties that are common to several existing architectures. We introduce the basis of their training procedure, backpropagation through time, as a general way to propagate and distribute the prediction error to previous states of the network. The learning procedure consists of updating the model parameters by minimizing a suitable loss function, which includes the error achieved on the target task and, usually, one or more regularization terms. We then discuss several ways of regularizing the system, highlighting their advantages and drawbacks. Besides the standard stochastic gradient descent procedure, we also present several additional optimization strategies proposed in the literature for updating the network weights. Finally, we illustrate the vanishing gradient effect, an inherent problem of gradient-based optimization techniques that occurs in several situations when training neural networks. We conclude by discussing the most recent and successful approaches proposed in the literature to limit the vanishing of the gradients.
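To make the quantities named in the abstract concrete, the following is a minimal sketch in assumed notation (hidden state h_t, per-step error term, regularizer Omega, learning rate eta; these symbols are illustrative and not taken from the chapter itself): the loss sums the task error over time and adds a weighted regularization term, stochastic gradient descent updates the parameters along the negative gradient, and the backpropagation-through-time gradient factors into products of state-to-state Jacobians.

\[
  L(\theta) \;=\; \sum_{t=1}^{T} \ell_t\!\left(y_t, \hat{y}_t(\theta)\right) \;+\; \lambda\,\Omega(\theta),
  \qquad
  \theta \;\leftarrow\; \theta - \eta\,\nabla_{\theta} L(\theta),
\]
\[
  \frac{\partial L}{\partial \theta}
  \;=\;
  \sum_{t=1}^{T} \sum_{k=1}^{t}
  \frac{\partial \ell_t}{\partial h_t}
  \left( \prod_{i=k}^{t-1} \frac{\partial h_{i+1}}{\partial h_i} \right)
  \frac{\partial h_k}{\partial \theta}.
\]

The product of Jacobians is the source of the vanishing gradient effect: when their norms are consistently below one, the contribution of distant time steps decays exponentially with t - k, while norms above one can instead make the gradient explode.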