In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated. A standard solution to flatten the corresponding rough landscape consists in summing the losses associated to different data points and obtaining a smoother empirical risk. Here we propose a complementary method that works for a single data point. The main idea is that a large amount of the roughness is uncorrelated in different parts of the landscape. One can then substantially reduce the noise by evaluating an empirical average of the gradient obtained as a sum over many random independent positions in the space of parameters to be optimized. We present an algorithm, called averaged gradient descent, based on this idea and we apply it to tensor PCA, which is a very hard estimation problem. We show that averaged gradient descent over-performs physical algorithms such as gradient descent and approximate message passing and matches the best algorithmic thresholds known so far, obtained by tensor unfolding and methods based on sum-of-squares.

How to iron out rough landscapes and get optimal performances: Averaged gradient descent and its application to tensor PCA / Biroli, G.; Cammarota, C.; Ricci-Tersenghi, F.. - In: JOURNAL OF PHYSICS. A, MATHEMATICAL AND THEORETICAL. - ISSN 1751-8113. - 53:17(2020). [10.1088/1751-8121/ab7b1f]

How to iron out rough landscapes and get optimal performances: Averaged gradient descent and its application to tensor PCA

Biroli G.;Cammarota C.;Ricci-Tersenghi F.
2020

Abstract

In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated. A standard solution to flatten the corresponding rough landscape consists in summing the losses associated to different data points and obtaining a smoother empirical risk. Here we propose a complementary method that works for a single data point. The main idea is that a large amount of the roughness is uncorrelated in different parts of the landscape. One can then substantially reduce the noise by evaluating an empirical average of the gradient obtained as a sum over many random independent positions in the space of parameters to be optimized. We present an algorithm, called averaged gradient descent, based on this idea and we apply it to tensor PCA, which is a very hard estimation problem. We show that averaged gradient descent over-performs physical algorithms such as gradient descent and approximate message passing and matches the best algorithmic thresholds known so far, obtained by tensor unfolding and methods based on sum-of-squares.
2020
averaged gradient descent; gradient descent; tensor PCA
01 Pubblicazione su rivista::01a Articolo in rivista
How to iron out rough landscapes and get optimal performances: Averaged gradient descent and its application to tensor PCA / Biroli, G.; Cammarota, C.; Ricci-Tersenghi, F.. - In: JOURNAL OF PHYSICS. A, MATHEMATICAL AND THEORETICAL. - ISSN 1751-8113. - 53:17(2020). [10.1088/1751-8121/ab7b1f]
File allegati a questo prodotto
File Dimensione Formato  
Biroli_How to iron out rough landscapes_2020.pdf

solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 595.13 kB
Formato Adobe PDF
595.13 kB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1436114
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 14
  • ???jsp.display-item.citation.isi??? 11
social impact