Neural reward machines / Umili, Elena; Argenziano, Francesco; Capobianco, Roberto. - 392 (2024), pp. 3055-3062. (Paper presented at the European Conference on Artificial Intelligence, held in Santiago de Compostela, Spain) [10.3233/FAIA240847].
Neural reward machines
Elena Umili; Francesco Argenziano; Roberto Capobianco
2024
Abstract
Non-Markovian Reinforcement Learning (RL) tasks are very hard to solve, because agents must consider the entire history of state-action pairs to act rationally in the environment. Most works use symbolic formalisms (such as Linear Temporal Logic or automata) to specify the temporally extended task. These approaches only work in finite and discrete-state environments, or in continuous problems for which a mapping between raw states and symbolic interpretations, known as a symbol grounding (SG) function, is available. Here, we define Neural Reward Machines (NRM), an automata-based neurosymbolic framework built on a probabilistic relaxation of Moore Machines, which can be used for both reasoning and learning in non-symbolic, non-Markovian RL domains. We combine RL with semi-supervised symbol grounding (SSSG) and show that NRMs can exploit high-level symbolic knowledge in non-symbolic environments without any knowledge of the SG function, outperforming Deep RL methods that cannot incorporate prior knowledge. Moreover, we advance research in SSSG by proposing an algorithm for analysing the groundability of temporal specifications that is more efficient than baseline techniques by a factor of 10^3.
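The core mechanism named in the abstract, a probabilistic relaxation of a Moore machine, can be sketched roughly as follows: the discrete machine state is replaced by a belief vector, transitions are taken in expectation over the symbol probabilities produced by the (learned) symbol grounding function, and the reward is the expected Moore output. This is a minimal illustrative sketch under assumed shapes and names (it is not the paper's code or API):

```python
# Illustrative sketch of a probabilistic (soft) Moore machine step.
# All names, shapes, and transition tables here are assumptions for
# demonstration, not the NRM implementation from the paper.
import numpy as np

n_states, n_symbols = 3, 2

# T[s, q, q'] = 1 if reading symbol s in machine state q leads to q'.
T = np.zeros((n_symbols, n_states, n_states))
T[0] = np.eye(n_states)                      # symbol 0: stay in place
T[1] = np.roll(np.eye(n_states), 1, axis=1)  # symbol 1: advance one state
out = np.array([0.0, 0.0, 1.0])              # Moore output (reward) per state

belief = np.array([1.0, 0.0, 0.0])           # start in the initial state
# Symbol distributions as a grounding function might emit per observation.
for sym_probs in [np.array([0.2, 0.8]), np.array([0.1, 0.9])]:
    # Expected transition matrix under the symbol distribution.
    T_soft = np.tensordot(sym_probs, T, axes=1)  # shape (n_states, n_states)
    belief = belief @ T_soft                     # propagate the state belief
    reward = belief @ out                        # expected Moore output
    print(belief, reward)
```

Because every step is a differentiable matrix product, gradients can flow from the reward back into whatever model produces `sym_probs`, which is what makes combining such a machine with semi-supervised symbol grounding plausible.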
File: Umili_Neural_2024.pdf
Access: open access
Note: https://ebooks.iospress.nl/doi/10.3233/FAIA240847
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 731.06 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.