A novel optimization perspective to the problem of designing sequences of tasks in a reinforcement learning framework / Seccia, R.; Foglino, F.; Leonetti, M.; Sagratella, S.. - In: OPTIMIZATION AND ENGINEERING. - ISSN 1389-4420. - 24:2(2022), pp. 831-846. [10.1007/s11081-021-09708-x]

A novel optimization perspective to the problem of designing sequences of tasks in a reinforcement learning framework

Seccia R.; Sagratella S.
2022

Abstract

Training agents over sequences of tasks is often employed in deep reinforcement learning to let the agents progress more quickly towards better behaviours. This problem, known as curriculum learning, has mainly been tackled in the literature by numerical methods based on enumeration strategies, which, however, can handle only small problems. In this work, we define a new optimization perspective on the curriculum learning problem, with the aim of developing efficient solution methods for complex reinforcement learning tasks. Specifically, we show how the curriculum learning problem can be viewed as an optimization problem with a nonsmooth and nonconvex objective function and an integer feasible region. We reformulate it by defining a grey-box function that includes a suitable scheduling problem. Numerical results on a benchmark environment from the reinforcement learning community show that the proposed approaches are effective in reaching better performance on large problems as well.
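
To make the scaling issue behind enumeration strategies concrete, the following minimal Python sketch counts and exhaustively searches candidate curricula. It is an illustration under stated assumptions, not the method of the paper: a curriculum is assumed here to be an ordered sequence of distinct tasks up to a maximum length, and `toy_score` is a hypothetical stand-in for the expensive black-box evaluation that would come from training and testing an agent.

```python
from itertools import permutations
from math import factorial


def count_curricula(n_tasks: int, max_len: int) -> int:
    """Count ordered sequences of distinct tasks with length 0..max_len.

    Assumption (not from the source): a curriculum never repeats a task.
    """
    return sum(
        factorial(n_tasks) // factorial(n_tasks - k)
        for k in range(max_len + 1)
    )


def toy_score(curriculum) -> float:
    """Hypothetical objective: reward each step up in difficulty,
    and charge a fixed cost per task included in the curriculum."""
    gain = 0
    previous = 0
    for task in curriculum:
        if task > previous:
            gain += 1
        previous = task
    return gain - 0.25 * len(curriculum)


def best_by_enumeration(tasks, max_len, score):
    """Exhaustive search: evaluate every candidate curriculum once."""
    best, best_value = (), score(())
    for k in range(1, max_len + 1):
        for candidate in permutations(tasks, k):
            value = score(candidate)
            if value > best_value:
                best, best_value = candidate, value
    return best, best_value


if __name__ == "__main__":
    # Tasks are labelled by an assumed integer difficulty level.
    print(count_curricula(3, 3))
    print(best_by_enumeration([1, 2, 3], 3, toy_score))
    # The candidate count grows factorially with the number of tasks.
    print(count_curricula(10, 10))
```

Because each evaluation of the real objective would require training an agent, the factorial growth of the candidate set is what limits exhaustive search to small instances in this sketch.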
(Deep) reinforcement learning; black-box optimization; curriculum learning; sequential model-based optimization
01 Journal publication::01a Journal article
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1645381