Deep neural networks are generally designed as a stack of differentiable layers, in which a prediction is obtained only after running the full stack. Recently, some contributions have proposed techniques to endow the networks with early exits, allowing to obtain predictions at intermediate points of the stack. These multi-output networks have a number of advantages, including (i) significant reductions of the inference time, (ii) reduced tendency to overfitting and vanishing gradients, and (iii) capability of being distributed over multi-tier computation platforms. In addition, they connect to the wider themes of biological plausibility and layered cognitive reasoning. In this paper, we provide a comprehensive introduction to this family of neural networks, by describing in a unified fashion the way these architectures can be designed, trained, and actually deployed in time-constrained scenarios. We also describe in-depth their application scenarios in 5G and Fog computing environments, as long as some of the open research questions connected to them.
Why should we add early exits to neural networks? / Scardapane, S.; Scarpiniti, M.; Baccarelli, E.; Uncini, A.. - In: COGNITIVE COMPUTATION. - ISSN 1866-9956. - 12:5(2020), pp. 954-966. [10.1007/s12559-020-09734-4]
Why should we add early exits to neural networks?
Scardapane S.
;Scarpiniti M.;Baccarelli E.;Uncini A.
2020
Abstract
Deep neural networks are generally designed as a stack of differentiable layers, in which a prediction is obtained only after running the full stack. Recently, some contributions have proposed techniques to endow the networks with early exits, allowing to obtain predictions at intermediate points of the stack. These multi-output networks have a number of advantages, including (i) significant reductions of the inference time, (ii) reduced tendency to overfitting and vanishing gradients, and (iii) capability of being distributed over multi-tier computation platforms. In addition, they connect to the wider themes of biological plausibility and layered cognitive reasoning. In this paper, we provide a comprehensive introduction to this family of neural networks, by describing in a unified fashion the way these architectures can be designed, trained, and actually deployed in time-constrained scenarios. We also describe in-depth their application scenarios in 5G and Fog computing environments, as long as some of the open research questions connected to them.File | Dimensione | Formato | |
---|---|---|---|
Scardapane_Why-should_2020.pdf
solo gestori archivio
Tipologia:
Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
697.52 kB
Formato
Adobe PDF
|
697.52 kB | Adobe PDF | Contatta l'autore |
Scardapane_preprint_Why_2020.pdf
solo gestori archivio
Tipologia:
Documento in Pre-print (manoscritto inviato all'editore, precedente alla peer review)
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
636.21 kB
Formato
Adobe PDF
|
636.21 kB | Adobe PDF | Contatta l'autore |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.