Convergence of ease-controlled Random Reshuffling gradient Algorithms under Lipschitz smoothness / Seccia, Ruggiero; Coppola, Corrado; Liuzzi, Giampaolo; Palagi, Laura. - In: COMPUTATIONAL OPTIMIZATION AND APPLICATIONS. - ISSN 1573-2894. - 91:2(2025), pp. 933-971. [10.1007/s10589-025-00667-y]

Convergence of ease-controlled Random Reshuffling gradient Algorithms under Lipschitz smoothness

Ruggiero Seccia; Corrado Coppola; Giampaolo Liuzzi; Laura Palagi
2025

Abstract

In this work, we consider minimizing the average of a very large number of smooth, possibly non-convex functions, and we focus on two widely used minibatch frameworks for this optimization problem: incremental gradient (IG) and random reshuffling (RR). We define ease-controlled modifications of the IG/RR schemes that require only a light additional computational effort yet can be proved to converge to a stationary point under weak and standard assumptions. In particular, we define two algorithmic schemes in which the IG/RR iteration is controlled by a watchdog rule and a derivative-free linesearch that activates only sporadically to adjust the stepsize so as to guarantee convergence. The two schemes differ in the watchdog rule and the linesearch, which follow either a monotone or a non-monotone acceptance rule. Both schemes control how the stepsize used in the main IG/RR iteration is updated, avoiding pre-set rules that may drive the stepsize to zero too fast and reducing the effort needed to design effective stepsize-updating rules. We perform a computational analysis using different deep neural architectures and a benchmark of datasets of varying size. We compare our implementation with both a full-batch gradient method (L-BFGS) and a fair implementation of IG/RR methods, showing that our algorithms require a computational effort similar to that of the other online algorithms and that the control on the learning rate may allow a faster decrease of the objective function.
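The abstract describes the algorithmic idea only in prose; the Python sketch below illustrates one plausible reading of it. It is a minimal toy under stated assumptions, not the paper's actual scheme: the names (rr_epoch, df_linesearch, ease_controlled_rr), the sufficient-decrease tests, and all parameter values (watchdog_tol, gamma, shrink, ...) are illustrative assumptions, and the paper's monotone and non-monotone variants differ precisely in how the reference value and the linesearch acceptance rule are defined.

import numpy as np

# Illustrative sketch only: names and constants are assumptions, not the
# paper's notation. It mimics the abstract's idea of an ease-controlled RR
# scheme: cheap reshuffled epochs are accepted as long as a watchdog test on
# the objective holds; otherwise a derivative-free linesearch along the epoch
# displacement restores sufficient decrease and the inner stepsize is reduced.

def rr_epoch(w, grads, stepsize, rng):
    """One random-reshuffling epoch: a random permutation of the P component
    gradients, each applied as an incremental gradient step."""
    for i in rng.permutation(len(grads)):
        w = w - stepsize * grads[i](w)
    return w

def df_linesearch(f, w, d, alpha0=1.0, delta=0.5, gamma=1e-6, max_trials=30):
    """Derivative-free backtracking along d: return the first alpha with
    f(w + alpha*d) <= f(w) - gamma*alpha**2, or 0.0 if none is found."""
    fw = f(w)
    alpha = alpha0
    for _ in range(max_trials):
        if f(w + alpha * d) <= fw - gamma * alpha ** 2:
            return alpha
        alpha *= delta
    return 0.0

def ease_controlled_rr(f, grads, w, stepsize=0.1, epochs=100,
                       watchdog_tol=1e-4, shrink=0.5, seed=0):
    rng = np.random.default_rng(seed)
    f_ref = f(w)  # monotone reference value; a non-monotone variant would
                  # instead compare against a memory of recent objective values
    for _ in range(epochs):
        w_trial = rr_epoch(w, grads, stepsize, rng)
        f_trial = f(w_trial)
        if f_trial <= f_ref - watchdog_tol:
            # watchdog satisfied: accept the cheap RR epoch as is
            w, f_ref = w_trial, f_trial
        else:
            # sporadic safeguard: derivative-free linesearch along the
            # epoch displacement, then shrink the RR stepsize
            alpha = df_linesearch(f, w, w_trial - w)
            w = w + alpha * (w_trial - w)
            f_ref = min(f_ref, f(w))
            stepsize *= shrink
    return w

Under these assumptions, f would be the full finite-sum objective and grads a list of per-sample (or per-minibatch) gradient callables; the expensive full evaluations of f occur only at epoch boundaries, which is consistent with the light additional computational effort the abstract claims.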
Finite-sum; Lipschitz smooth; Minibatch method; Non-monotone schemes
01 Journal publication::01a Journal article
Files attached to this product
Seccia_Convergence_2025.pdf (open access)
Note: https://link.springer.com/content/pdf/10.1007/s10589-025-00667-y.pdf
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 902.05 kB, Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1725648
Citations
  • Scopus: 1
  • Web of Science: 1