
Exploiting Symbolic Heuristics for the Synthesis of Domain-Specific Temporal Planning Guidance Using Reinforcement Learning / Brugnara, Irene; Valentini, Alessandro; Micheli, Andrea. - 413:(2025), pp. 2530-2537. (Intervento presentato al convegno 28th European Conference on Artificial Intelligence (ECAI 2025) tenutosi a Bologna; Italy) [10.3233/faia251102].

Exploiting Symbolic Heuristics for the Synthesis of Domain-Specific Temporal Planning Guidance Using Reinforcement Learning

Brugnara, Irene; Valentini, Alessandro; Micheli, Andrea
2025

Abstract

Recent work investigated the use of Reinforcement Learning (RL) for the synthesis of heuristic guidance to improve the performance of temporal planners when a domain is fixed and a set of training problems (not plans) is given. The idea is to extract a heuristic from the value function of a particular (possibly infinite-state) MDP constructed over the training problems. In this paper, we propose an evolution of this learning and planning framework that focuses on exploiting the information provided by symbolic heuristics during both the RL and planning phases. First, we formalize different reward schemata for the synthesis and use symbolic heuristics to mitigate the problems caused by the truncation of episodes needed to deal with the potentially infinite MDP. Second, we propose learning a residual of an existing symbolic heuristic, which is a "correction" of the heuristic value, instead of eagerly learning the whole heuristic from scratch. Finally, we use the learned heuristic in combination with a symbolic heuristic using a multiple-queue planning approach to balance systematic search with imperfect learned information. We experimentally compare all the approaches, highlighting their strengths and weaknesses and significantly advancing the state of the art for this planning and learning schema.
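The residual idea described in the abstract can be sketched as follows: rather than learning a heuristic from scratch, one learns a correction r(s) on top of an existing symbolic heuristic h_sym(s), so the guidance used at search time is h(s) = h_sym(s) + r(s). This is a minimal illustrative sketch, not the paper's implementation: the goal set, the tabular residual, and the simple fitting loop are all hypothetical stand-ins for the RL machinery described in the paper.

```python
GOALS = {"g1", "g2", "g3"}  # hypothetical goal atoms for a toy problem

def h_sym(state):
    # Stand-in symbolic heuristic: number of goals not yet achieved.
    return len(GOALS - state)

residual = {}  # learned per-state correction (tabular for illustration)

def h_learned(state):
    # Combined guidance: symbolic estimate plus learned residual.
    return h_sym(state) + residual.get(frozenset(state), 0.0)

def train_residual(samples, lr=0.5, epochs=20):
    # Fit the residual so that h_learned matches observed cost-to-go
    # targets collected from training episodes (supervised stand-in
    # for the value-function learning done via RL in the paper).
    for _ in range(epochs):
        for state, target in samples:
            key = frozenset(state)
            err = target - (h_sym(state) + residual.get(key, 0.0))
            residual[key] = residual.get(key, 0.0) + lr * err

# Toy data: (state, true cost-to-go). h_sym underestimates the second
# state by 2, so the residual should learn a correction close to +2.
samples = [({"g1"}, 2.0), ({"g1", "g2"}, 3.0)]
train_residual(samples)
print(h_learned({"g1", "g2"}))  # ≈ 3.0 after training
```

The benefit over learning from scratch is that the network (here, a table) only has to model the symbolic heuristic's error, which is typically a smoother and smaller-magnitude target than the full cost-to-go.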
2025
28th European Conference on Artificial Intelligence (ECAI 2025)
temporal planning; reinforcement learning; automated planning
04 Conference proceedings publication::04b Conference paper in volume
Files attached to this record
There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1755092
Warning: the displayed data has not been validated by the university.
