Visual reward machines

Elena Umili; Francesco Argenziano; Aymeric Barbin; Roberto Capobianco
2023

Abstract

Non-Markovian Reinforcement Learning (RL) tasks are extremely hard to solve, because intelligent agents must consider the entire history of state-action pairs to act rationally in the environment. Most works use Linear Temporal Logic (LTL) to specify temporally-extended tasks. This approach applies only to finite, discrete-state environments, or to continuous problems for which a mapping between the continuous state and a symbolic interpretation, known as a symbol grounding function, is available. In this work, we define Visual Reward Machines (VRM), an automata-based neurosymbolic framework that can be used for both reasoning and learning in non-symbolic, non-Markovian RL domains. A VRM is a fully neural yet interpretable system, based on a probabilistic relaxation of Moore Machines. Results show that VRMs can exploit ungrounded symbolic temporal knowledge to outperform baseline methods based on RNNs in non-Markovian RL tasks.
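
The sketch below illustrates the core idea of a probabilistic relaxation of a Moore machine: the hard automaton state is replaced by a probability distribution over states, updated by soft transitions weighted by predicted symbol probabilities (e.g., produced by a perception network from raw images). Class and variable names, tensor shapes, and the exact update rule are illustrative assumptions, not the implementation described in the paper.

```python
# Minimal sketch (assumed formulation, not the authors' code) of a
# probabilistic relaxation of a Moore machine.
import numpy as np

class ProbabilisticMooreMachine:
    def __init__(self, transition, output):
        # transition[s, p, t] = P(next state t | current state s, symbol p)
        # output[s]           = output (e.g., reward) associated with state s
        self.transition = transition          # shape (S, P, S)
        self.output = output                  # shape (S,)
        self.belief = np.zeros(transition.shape[0])
        self.belief[0] = 1.0                  # start fully in the initial state

    def step(self, symbol_probs):
        # symbol_probs: predicted probabilities of each symbol, shape (P,),
        # e.g., the output of a perception network on a raw observation.
        # Soft transition: marginalize over current states and symbols.
        self.belief = np.einsum("s,p,spt->t",
                                self.belief, symbol_probs, self.transition)
        self.belief /= self.belief.sum()      # renormalize for numerical safety
        # Expected Moore output under the current state distribution,
        # usable as a (non-Markovian) reward signal for the RL agent.
        return float(self.belief @ self.output)
```

Because every operation is differentiable, such a module can in principle be trained end-to-end together with the symbol-grounding network, which is the kind of integration the abstract refers to.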
2023
17th International Workshop on Neural-Symbolic Learning and Reasoning
non-Markovian reinforcement learning; neurosymbolic AI; symbol grounding; deep reinforcement learning
04 Conference proceedings publication::04b Conference paper in volume
Visual reward machines / Umili, Elena; Argenziano, Francesco; Barbin, Aymeric; Capobianco, Roberto. - 3432:(2023), pp. 255-267. (Paper presented at the 17th International Workshop on Neural-Symbolic Learning and Reasoning, held at La Certosa di Pontignano (SI), Italy).
Files attached to this item

Umili_Visual_2023.pdf

Open access

Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 1.87 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1684340
Citations
  • Scopus: 2