We consider the problem of optimization in policy space for reinforcement learning. While a plethora of methods have been applied to this problem, only a narrow category of them has proved feasible in robotics. We consider the particular characteristics of reinforcement learning in robotics, and devise a combination of two algorithms from the derivative-free optimization literature. The proposed combination is well suited to robotics, as it involves both off-line learning in simulation and on-line learning in the real environment. We demonstrate our approach on a real-world task, in which an Autonomous Underwater Vehicle has to survey a target area under potentially unknown environmental conditions. We start from a given controller, which can perform the task under foreseeable conditions, and make it adaptive to the actual environment.

Combining local and global direct derivative-free optimization for reinforcement learning / Leonetti, Matteo; Kormushev, Petar; Sagratella, Simone. - In: CYBERNETICS AND INFORMATION TECHNOLOGIES. - ISSN 1311-9702. - 12:3(2012), pp. 53-65.

Combining local and global direct derivative-free optimization for reinforcement learning

LEONETTI, MATTEO; SAGRATELLA, SIMONE
2012

Autonomous underwater vehicles; Derivative-free optimization; Policy search; Reinforcement learning; Robotics; Computer Science (all)
01 Journal publication::01a Journal article
Files attached to this item
VE_2012_11573-944764.pdf (Adobe PDF, 965.68 kB); access restricted to archive managers
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/944764