Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-Start / Grazzi, Riccardo; Pontil, Massimiliano; Salzo, Saverio. - In: JOURNAL OF MACHINE LEARNING RESEARCH. - ISSN 1532-4435. - 24:(2023), pp. 1-37.

Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-Start

Saverio Salzo
2023

Abstract

We analyse a general class of bilevel problems, in which the upper-level problem consists in the minimization of a smooth objective function and the lower-level problem is to find the fixed point of a smooth contraction map. This class of problems includes instances of meta-learning, equilibrium models, hyperparameter optimization and data poisoning adversarial attacks. Several recent works have proposed algorithms which warm-start the lower-level problem, i.e. they use the previous lower-level approximate solution as a starting point for the lower-level solver. This warm-start procedure allows one to improve the sample complexity in both the stochastic and deterministic settings, achieving in some cases the order-wise optimal sample complexity. However, there are situations, e.g., meta-learning and equilibrium models, in which the warm-start procedure is not well-suited or ineffective. In this work we show that, without warm-start, it is still possible to achieve order-wise (near-)optimal sample complexity. In particular, we propose a simple method which uses (stochastic) fixed-point iterations at the lower level and projected inexact gradient descent at the upper level, and which reaches an ε-stationary point using O(ε⁻²) and Õ(ε⁻¹) samples for the stochastic and the deterministic setting, respectively. Finally, compared to methods using warm-start, our approach yields a simpler analysis that does not need to study the coupled interactions between the upper-level and lower-level iterates.
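As a rough illustration of the scheme described above — cold-start fixed-point iterations at the lower level and (here, unprojected) gradient steps at the upper level, with the implicit gradient obtained from an adjoint fixed-point recursion — the following is a minimal deterministic sketch on a toy scalar problem. The contraction Phi, the objective f, the step sizes and iteration counts are all illustrative assumptions, not the paper's setting or experiments:

```python
# Toy bilevel instance (illustrative assumptions only):
#   lower level: w = Phi(x, w) = 0.5 * w + x   (a 0.5-contraction in w, fixed point w*(x) = 2x)
#   upper level: f(x, w) = x**2 + (w - 1)**2, so F(x) = x**2 + (2x - 1)**2, minimised at x = 0.4

def phi(x, w):
    # smooth contraction map in w
    return 0.5 * w + x

def grad_f(x, w):
    # partial derivatives (df/dx, df/dw)
    return 2.0 * x, 2.0 * (w - 1.0)

def hypergradient(x, t=50):
    # lower level: fixed-point iterations started cold (no warm-start)
    w = 0.0
    for _ in range(t):
        w = phi(x, w)
    # adjoint fixed-point recursion for v = (dPhi/dw) * v + df/dw
    gx, gw = grad_f(x, w)
    v = 0.0
    for _ in range(t):
        v = 0.5 * v + gw
    # total (inexact) gradient of F: df/dx + (dPhi/dx) * v, with dPhi/dx = 1 here
    return gx + 1.0 * v

# upper level: inexact gradient descent
x = 0.0
for _ in range(200):
    x -= 0.05 * hypergradient(x)

print(round(x, 4))  # approaches the minimiser 0.4
```

Because the lower-level map is a contraction, both fixed-point loops converge linearly, so the hypergradient is accurate even though each upper-level step restarts the lower-level solver from scratch.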
bilevel optimization; stochastic optimization; hyperparameter optimization; meta-learning
01 Journal publication::01a Journal article
Files attached to this record

Grazzi_Bilevel_2023.pdf
  Access: open access
  Type: Publisher's version (published with the publisher's layout)
  License: Creative Commons
  Size: 606.65 kB
  Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1681338