
Exact full-RSB SAT/UNSAT transition in infinitely wide two-layer neural networks / Annesi, Brandon L.; Malatesta, Enrico; Zamponi, Francesco. - In: SCIPOST PHYSICS. - ISSN 2542-4653. - 18:4(2025), pp. 1-39. [10.21468/SciPostPhys.18.4.118]

Exact full-RSB SAT/UNSAT transition in infinitely wide two-layer neural networks

Brandon L. Annesi; Enrico Malatesta; Francesco Zamponi
2025

Abstract

We analyze the problem of storing random pattern-label associations using two classes of continuous non-convex weight models, namely the perceptron with negative margin and an infinite-width two-layer neural network with non-overlapping receptive fields and generic activation function. Using a full-RSB Ansatz we compute the exact value of the SAT/UNSAT transition. Furthermore, in the case of the negative perceptron we show that the overlap distribution of typical states displays an overlap gap (a disconnected support) in certain regions of the phase diagram defined by the value of the margin and the density of patterns to be stored. This implies that some recent theorems ensuring convergence of Approximate Message Passing (AMP) based algorithms to capacity are not applicable. Finally, we show that Gradient Descent is not able to reach the maximal capacity, irrespective of the presence of an overlap gap for typical states. This finding, similar to what occurs in binary weight models, suggests that gradient-based algorithms are biased towards highly atypical states, whose inaccessibility determines the algorithmic threshold.
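As a toy illustration of the storage constraint the abstract refers to: a pattern-label pair (x_μ, y_μ) is stored by a perceptron with margin κ when y_μ (w·x_μ)/√N ≥ κ, and κ < 0 gives the negative perceptron. The sketch below, with illustrative names not taken from the paper, checks what fraction of random Gaussian patterns a random spherical weight vector stores.

```python
import numpy as np

# Hedged sketch (names are illustrative, not from the paper's code):
# a pattern (x_mu, y_mu) is "stored" if y_mu * (w . x_mu) / sqrt(N) >= kappa,
# where kappa may be negative (the negative perceptron of the abstract).

def stored_fraction(w, X, y, kappa):
    """Fraction of the P patterns whose margin constraint is satisfied."""
    N = w.shape[0]
    margins = y * (X @ w) / np.sqrt(N)
    return float(np.mean(margins >= kappa))

rng = np.random.default_rng(0)
N, P = 200, 100                       # pattern density alpha = P/N = 0.5
X = rng.standard_normal((P, N))       # random Gaussian patterns
y = rng.choice([-1.0, 1.0], size=P)   # random labels
w = rng.standard_normal(N)
w *= np.sqrt(N) / np.linalg.norm(w)   # spherical constraint |w|^2 = N

f = stored_fraction(w, X, y, kappa=-0.5)
```

For a random w on the sphere, each rescaled margin is approximately a standard Gaussian, so a negative κ lets even an untrained weight vector satisfy well over half the constraints; the paper's question is for which (α, κ) a single w can satisfy all of them.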
glass transition; fluctuations; statistical mechanics
01 Journal publication::01a Journal article
Files attached to this record
File: Annesi_Exact-full-RSB_2025.pdf
Access: open access
Note: Journal article
Type: Publisher's version (published with the publisher's layout)
License: Creative Commons
Size: 1.25 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1745572
Citations
  • Scopus: 1
  • Web of Science: 1