Conditional computation in neural networks: Principles and research trends

Scardapane, Simone; Baiocchi, Alessandro; Devoto, Alessio; Marsocci, Valerio; Minervini, Pasquale; Pomponi, Jary

doi:10.3233/ia-240035

This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or deactivate parts of their computational graph conditionally on their input. Examples include the dynamic selection of input tokens, layers (or sets of layers), and sub-modules inside each layer (e.g., channels in a convolutional filter). We first provide a general formalism to describe these techniques in a uniform way. Then, we introduce three notable implementations of these principles: mixture-of-experts (MoE) networks, token selection mechanisms, and early-exit neural networks. The paper aims to provide a tutorial-like introduction to this growing field. To this end, we analyze the benefits of these modular designs in terms of efficiency, explainability, and transfer learning, with a focus on emerging application areas ranging from automated scientific discovery to semantic communication.
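As a rough illustration of what "activating parts of the computational graph conditionally on the input" can mean, the following is a minimal sketch of top-k mixture-of-experts routing in plain Python. It is not taken from the paper: the function names (`softmax`, `top_k_moe`), the linear router, the renormalization of the selected gates, and the toy experts are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_moe(x, router_weights, experts, k=1):
    """Route x to the k highest-scoring experts and mix their outputs.

    Assumed (hypothetical) design: a linear router gives one score per
    expert, and the gates of the selected experts are renormalized.
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in router_weights]
    gates = softmax(scores)
    # Indices of the k experts with the largest gate values.
    chosen = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        y = experts[i](x)  # experts that were not selected are never evaluated
        out = [o + (gates[i] / norm) * y_j for o, y_j in zip(out, y)]
    return out, chosen


# Toy experts (illustrative only): each maps a vector to a vector.
demo_experts = [
    lambda v: [2.0 * a for a in v],
    lambda v: [-a for a in v],
    lambda v: [a + 1.0 for a in v],
]
demo_router = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

demo_out, demo_chosen = top_k_moe([3.0, 1.0], demo_router, demo_experts, k=1)
print(demo_chosen, demo_out)
```

With `k=1` only one expert function is called per input, which is where the efficiency benefit mentioned in the abstract would come from in this toy setting; the index list returned alongside the output also shows which module handled a given input.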

Conditional computation in neural networks: Principles and research trends / Scardapane, Simone; Baiocchi, Alessandro; Devoto, Alessio; Marsocci, Valerio; Minervini, Pasquale; Pomponi, Jary. - In: INTELLIGENZA ARTIFICIALE. - ISSN 1724-8035. - 18:1(2024), pp. 175-190. [10.3233/ia-240035]

Publication year: 2024

Keywords: Conditional computation, neural networks, modularity, explainability, efficiency

Type: Journal article (IRIS category: 01 Pubblicazione su rivista::01a Articolo in rivista)

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1717183
