
Properties and training in recurrent neural networks / Bianchi, Filippo Maria; Maiorino, Enrico; Kampffmeyer, Michael C.; Rizzi, Antonello; Jenssen, Robert. - STAMPA. - (2017), pp. 9-21. - SPRINGERBRIEFS IN COMPUTER SCIENCE. [10.1007/978-3-319-70338-1_2].

Properties and training in recurrent neural networks

Bianchi, Filippo Maria; Maiorino, Enrico; Kampffmeyer, Michael C.; Rizzi, Antonello; Jenssen, Robert
2017

Abstract

In this chapter, we describe the basic concepts behind the functioning of recurrent neural networks and explain the general properties that are common to several existing architectures. We introduce the basis of their training procedure, backpropagation through time, as a general way to propagate and distribute the prediction error to previous states of the network. The learning procedure consists of updating the model parameters by minimizing a suitable loss function, which includes the error achieved on the target task and, usually, one or more regularization terms. We then discuss several ways of regularizing the system, highlighting their advantages and drawbacks. Besides the standard stochastic gradient descent procedure, we also present several additional optimization strategies proposed in the literature for updating the network weights. Finally, we illustrate the vanishing gradient effect, an inherent problem of gradient-based optimization techniques that occurs in several situations while training neural networks. We conclude by discussing the most recent and successful approaches proposed in the literature to limit the vanishing of the gradients.
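
As a concrete illustration of the ideas summarized above, the following minimal sketch (Python with NumPy; variable names such as W_hh and the toy dimensions are our own assumptions, not the chapter's code) unrolls a vanilla recurrent network, runs backpropagation through time by chaining the Jacobians of consecutive hidden states, and prints the norm of the back-propagated error signal, which shrinks geometrically and thus exhibits the vanishing gradient effect the abstract refers to.

    import numpy as np

    # Minimal sketch (assumed names, not the chapter's own code): a vanilla RNN
    # unrolled over T steps. The backward loop implements backpropagation
    # through time (BPTT) and prints the norm of the error signal, which
    # typically shrinks geometrically -- the vanishing-gradient effect.

    rng = np.random.default_rng(0)
    T, n_in, n_h = 50, 3, 8

    W_xh = rng.normal(scale=0.1, size=(n_h, n_in))  # input-to-hidden weights
    W_hh = rng.normal(scale=0.1, size=(n_h, n_h))   # recurrent weights
    x = rng.normal(size=(T, n_in))                  # toy input sequence

    # Forward pass: h_{t+1} = tanh(W_xh x_t + W_hh h_t)
    h = np.zeros((T + 1, n_h))
    for t in range(T):
        h[t + 1] = np.tanh(W_xh @ x[t] + W_hh @ h[t])

    # Backward pass (BPTT): push the error at the last state back through the
    # Jacobians dh_{t+1}/dh_t = diag(1 - h_{t+1}^2) W_hh.
    delta = np.ones(n_h)  # stand-in for dL/dh_T
    for t in reversed(range(T)):
        delta = W_hh.T @ ((1.0 - h[t + 1] ** 2) * delta)
        if t % 10 == 0:
            print(f"t={t:2d}  ||dL/dh_t|| = {np.linalg.norm(delta):.3e}")

With small recurrent weights the printed norms decay by orders of magnitude over a few dozen steps; the approaches the chapter surveys for limiting the vanishing of the gradients aim to counteract precisely this decay.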
Recurrent Neural Networks for Short-Term Load Forecasting: An Overview and Comparative Analysis
978-3-319-70337-4
978-3-319-70338-1
Backpropagation through time; Gradient descent; Learning procedures in neural networks; Loss function; Parameters training; Regularization techniques; Vanishing gradient; Computer Science (all)
02 Publication in a volume::02a Chapter, article, or contribution
Files attached to this record

Bianchi_Properties_2017.pdf
  Access: archive managers only (contact the author)
  Type: publisher's version (published with the publisher's layout)
  License: all rights reserved
  Size: 249.55 kB
  Format: Adobe PDF

Bianchi_Recurrent_Frontespizio-colophon-indice_2017.pdf
  Access: archive managers only (contact the author)
  Type: publisher's version (published with the publisher's layout)
  License: all rights reserved
  Size: 1.48 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1118566
Citations
  • PMC: not available
  • Scopus: 7
  • Web of Science: not available