Bregman Neural Networks / Frecon, Jordan; Gasso, Gilles; Pontil, Massimiliano; Salzo, Saverio. - PMLR 162 (2022). (Paper presented at the International Conference on Machine Learning, held in Baltimore, Maryland, USA).
Bregman Neural Networks
Saverio Salzo
2022
Abstract
We present a framework based on bilevel optimization for learning multilayer, deep data representations. On the one hand, the lower-level problem finds a representation by successively minimizing layer-wise objectives made of the sum of a prescribed regularizer, a fidelity term and a linear function depending on the representation found at the previous layer. On the other hand, the upper-level problem optimizes over the linear functions to yield a linearly separable final representation. We show that, by choosing the fidelity term as the quadratic distance between two successive layer-wise representations, the bilevel problem reduces to the training of a feedforward neural network. In contrast, by elaborating on Bregman distances, we devise a novel neural network architecture additionally involving the inverse of the activation function, reminiscent of the skip connection used in ResNets. Numerical experiments suggest that the proposed Bregman variant benefits from better learning properties and more robust prediction performance.
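The abstract contrasts a standard feedforward layer, sigma(Wx + b), with a Bregman variant in which the previous representation also enters through the inverse activation inside the nonlinearity. The sketch below is one reading of that description, not the paper's exact equations: the layer form sigma(sigma^{-1}(x) + Wx + b), the class name `BregmanLayer`, the choice of a sigmoid activation, and the clamping for numerical stability are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class BregmanLayer(nn.Module):
    """Hypothetical sketch of a Bregman-style layer: the previous
    representation re-enters through the inverse activation, reminiscent
    of a ResNet skip connection (per the abstract's description)."""

    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    @staticmethod
    def sigma(z: torch.Tensor) -> torch.Tensor:
        # Activation, interpreted as the gradient of a Bregman potential.
        return torch.sigmoid(z)

    @staticmethod
    def sigma_inv(x: torch.Tensor) -> torch.Tensor:
        # Inverse of the sigmoid (logit); clamp keeps it well defined.
        x = x.clamp(1e-6, 1 - 1e-6)
        return torch.log(x) - torch.log1p(-x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Standard layer would return sigma(Wx + b); the Bregman variant
        # additionally feeds sigma^{-1}(x) into the activation.
        return self.sigma(self.sigma_inv(x) + self.linear(x))


# Minimal usage: the input representation is assumed to lie in (0, 1),
# i.e. in the range of the sigmoid activation.
layer = BregmanLayer(16)
x = torch.sigmoid(torch.randn(8, 16))
y = layer(x)
```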
| File | Size | Format |
|---|---|---|
| Frecon_Bregman_2022.pdf | 739.51 kB | Adobe PDF |

Open access. Type: publisher's version (published version with the publisher's layout). License: All rights reserved.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.