Hi-Val: Iterative Learning of Hierarchical Value Functions for Policy Generation

Capobianco, Roberto; Riccio, Francesco; Nardi, Daniele

doi:10.1007/978-3-030-01370-7_33

Task decomposition is effective in many applications where the global complexity of a problem makes planning and decision-making too demanding. This is true, for example, in high-dimensional robotics domains, where (1) unpredictability and modeling limitations typically prevent the manual specification of robust behaviors, and (2) learning an action policy is challenging due to the curse of dimensionality. In this work, we borrow the concept of Hierarchical Task Networks (HTNs) to decompose the learning procedure, and we exploit Upper Confidence Tree (UCT) search to introduce HOP, a novel iterative algorithm for hierarchical optimistic planning with learned value functions. To obtain better generalization and generate policies, HOP simultaneously learns and uses action values. These are used to formalize constraints within the search space and to reduce the dimensionality of the problem. We evaluate our algorithm both on a fetching task using a simulated 7-DOF KUKA lightweight arm and on a pick-and-delivery task with a Pioneer robot.
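
The abstract does not spell out how HOP works, so the following is only a minimal sketch of the standard UCB1 selection rule on which UCT search is generally built. It is not the authors' implementation. The function name `ucb1_select`, the `exploration` constant and its default value, and the list-based inputs are all assumptions made for illustration.

```python
import math


def ucb1_select(q_values, visit_counts, exploration=1.4):
    """Pick an action index with the UCB1 rule (illustrative sketch only).

    q_values[i]     -- current value estimate of action i (hypothetical input)
    visit_counts[i] -- number of times action i has been tried so far
    exploration     -- assumed exploration constant, not taken from the paper
    """
    total_visits = sum(visit_counts)
    best_index, best_score = 0, float("-inf")
    for index, (q, n) in enumerate(zip(q_values, visit_counts)):
        if n == 0:
            # An untried action is selected first, so every action is sampled once.
            return index
        # Value estimate plus an optimism bonus that shrinks as the action is visited.
        score = q + exploration * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best_index, best_score = index, score
    return best_index
```

With equal visit counts the rule reduces to picking the highest value estimate; a rarely visited action receives a larger bonus and can be preferred even when its estimate is no better.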

Hi-Val: Iterative Learning of Hierarchical Value Functions for Policy Generation / Capobianco, Roberto; Riccio, Francesco; Nardi, Daniele. - 867:(2019), pp. 414-427. (Paper presented at the 15th International Conference on Intelligent Autonomous Systems, IAS 2018, held in Baden-Baden, Germany) [10.1007/978-3-030-01370-7_33].



Publication year	2019
Conference name	15th International Conference on Intelligent Autonomous Systems, IAS 2018
Keywords	Robot Planning; Robot Learning; Hierarchical Value Function Learning
Type	04 Publication in conference proceedings::04b Conference paper in volume

Files attached to this record

File	Access	Type	License	Size	Format
Capobianco_Preprint_HI-VAL_2019.pdf	Open access	Pre-print (manuscript submitted to the publisher, prior to peer review)	Creative Commons	857.69 kB	Adobe PDF
Capobianco_HI-VAL_2019.pdf	Archive managers only	Publisher's version (published version with the publisher's layout)	All rights reserved	1.13 MB	Adobe PDF
Capobianco_Frontespizio-indice_HI-VAL_2019.pdf	Archive managers only	Other attached material	All rights reserved	193.98 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1132271
