Gradient descent is commonly used to find minima in rough landscapes, particularly in recent machine learning applications. However, a theoretical understanding of why good solutions are found remains elusive, especially in strongly non-convex and high-dimensional settings. Here, we focus on the phase retrieval problem as a typical example, which has received a lot of attention recently in theoretical machine learning. We analyze the Hessian during gradient descent, identify a dynamical transition in its spectral properties, and relate it to the ability of escaping rough regions in the loss landscape. When the signal-to-noise ratio (SNR) is large enough, an informative negative direction exists in the Hessian at the beginning of the descent, i.e in the initial condition. While descending, a Baik–Ben Arous–Pêché transition in the spectrum takes place in finite time: the direction is lost, and the dynamics is trapped in a rugged region filled with marginally stable bad minima. Surprisingly, for finite system sizes, this window of negative curvature allows the system to recover the signal well before the theoretical SNR found for infinite sizes, emphasizing the central role of initialization and early-time dynamics for efficiently navigating rough landscapes.

The role of the time-dependent Hessian in high-dimensional optimization / Bonnaire, Tony; Biroli, Giulio; Cammarota, Chiara. - In: JOURNAL OF STATISTICAL MECHANICS: THEORY AND EXPERIMENT. - ISSN 1742-5468. - 2025:8(2025), pp. 1-33. [10.1088/1742-5468/adf4bc]

The role of the time-dependent Hessian in high-dimensional optimization

Giulio Biroli;Chiara Cammarota
2025

Abstract

Gradient descent is commonly used to find minima in rough landscapes, particularly in recent machine learning applications. However, a theoretical understanding of why good solutions are found remains elusive, especially in strongly non-convex and high-dimensional settings. Here, we focus on the phase retrieval problem as a typical example, which has received a lot of attention recently in theoretical machine learning. We analyze the Hessian during gradient descent, identify a dynamical transition in its spectral properties, and relate it to the ability of escaping rough regions in the loss landscape. When the signal-to-noise ratio (SNR) is large enough, an informative negative direction exists in the Hessian at the beginning of the descent, i.e in the initial condition. While descending, a Baik–Ben Arous–Pêché transition in the spectrum takes place in finite time: the direction is lost, and the dynamics is trapped in a rugged region filled with marginally stable bad minima. Surprisingly, for finite system sizes, this window of negative curvature allows the system to recover the signal well before the theoretical SNR found for infinite sizes, emphasizing the central role of initialization and early-time dynamics for efficiently navigating rough landscapes.
2025
machine learning; inference; rough cost landscape; dynamics
01 Pubblicazione su rivista::01a Articolo in rivista
The role of the time-dependent Hessian in high-dimensional optimization / Bonnaire, Tony; Biroli, Giulio; Cammarota, Chiara. - In: JOURNAL OF STATISTICAL MECHANICS: THEORY AND EXPERIMENT. - ISSN 1742-5468. - 2025:8(2025), pp. 1-33. [10.1088/1742-5468/adf4bc]
File allegati a questo prodotto
File Dimensione Formato  
Bonnaire_The-role_2025.pdf

solo gestori archivio

Note: Articolo su rivista
Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 960.98 kB
Formato Adobe PDF
960.98 kB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1767122
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact