LOOP: Iterative learning for optimistic planning on robots

Riccio F.; Capobianco R.; Nardi D.
2021

Abstract

Efficient robotic behaviors require robustness and adaptation to dynamic changes of the environment, whose characteristics rapidly vary during robot operation. To generate effective robot action policies, planning and learning techniques have shown the most promising results. However, if considered individually, they present different limitations. Planning techniques lack generalization among similar states and require experts to define behavioral routines at different levels of abstraction. Conversely, learning methods usually require a considerable number of training samples and iterations of the algorithm. To overcome these issues, and to efficiently generate robot behaviors, we introduce LOOP, an iterative learning algorithm for optimistic planning that combines state-of-the-art planning and learning techniques to generate action policies. The main contribution of LOOP is the combination of Monte-Carlo Search Planning and Q-learning, which enables focused exploration during policy refinement in different robotic applications. We demonstrate the robustness and flexibility of LOOP in various domains and multiple robotic platforms, by validating the proposed approach with an extensive experimental evaluation.
Autonomous planning and learning; Deep robot reinforcement learning; Monte-Carlo planning; Q-learning
01 Journal publication::01a Journal article
LOOP: Iterative learning for optimistic planning on robots / Riccio, F.; Capobianco, R.; Nardi, D.. - In: ROBOTICS AND AUTONOMOUS SYSTEMS. - ISSN 0921-8890. - 136:(2021). [10.1016/j.robot.2020.103693]
Files attached to this item
File: Riccio_LOOP_2021.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 3.8 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1486009
Citations
  • PMC: not available
  • Scopus: 2
  • Web of Science (ISI): 2