Understanding Deep RL agent decisions: a novel interpretable approach with trainable prototypes / Borzillo, Caterina; Ragno, Alessio; Capobianco, Roberto. - (2023). (Paper presented at the conference XAI.it 2023: Italian Workshop on Explainable Artificial Intelligence 2023, held in Rome).

Understanding Deep RL agent decisions: a novel interpretable approach with trainable prototypes

Caterina Borzillo; Alessio Ragno; Roberto Capobianco

2023

Abstract

Deep reinforcement learning (DRL) models have shown great promise in various applications, but their opaque decision-making limits practical adoption in critical domains. Explainable AI (XAI) techniques address this challenge by making black-box models more transparent and interpretable. However, most current interpretable systems target supervised learning, leaving reinforcement learning relatively unexplored. This paper extends PW-Net, an interpretable wrapper model for DRL agents inspired by image classification methodologies. We introduce Shared-PW-Net, an interpretable deep learning model with a fully trainable prototype layer. Unlike PW-Net, Shared-PW-Net does not rely on pre-existing prototypes; instead, it builds on ProtoPool to automatically learn general prototypes that are assigned to actions during training. We also propose a novel prototype initialization method that significantly improves the model’s performance. Through extensive experimentation, we demonstrate that Shared-PW-Net achieves the same reward performance as existing methods without requiring human intervention. The fully trainable prototype layer, together with the proposed initialization approach, contributes to a clearer and more interpretable decision-making process. The code for this work is publicly available.
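
To make the idea of a shared, trainable prototype layer more concrete, the following is a minimal NumPy sketch written for illustration only; it is not the authors' implementation. It assumes a ProtoPNet-style log similarity between a latent state and each prototype, and it replaces ProtoPool's assignment mechanism with a plain softmax over a shared pool of prototypes. All names (`prototype_similarities`, `action_scores`, `assignment_logits`) and all numeric values are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def prototype_similarities(latent, prototypes, eps=1e-4):
    """Similarity of one latent vector to every prototype.

    Assumption: the ProtoPNet-style score log((d + 1) / (d + eps)),
    where d is the squared Euclidean distance. It grows as d shrinks.
    """
    d = ((latent[None, :] - prototypes) ** 2).sum(axis=1)
    return np.log((d + 1.0) / (d + eps))


def action_scores(latent, prototypes, assignment_logits):
    """Score each action from a pool of prototypes shared by all actions.

    assignment_logits has shape (num_prototypes, num_actions). A softmax
    over the prototype axis gives each action a soft distribution over the
    pool. This is a simplification assumed for this sketch, not the exact
    assignment procedure used by ProtoPool or Shared-PW-Net.
    """
    sims = prototype_similarities(latent, prototypes)   # (num_prototypes,)
    weights = softmax(assignment_logits, axis=0)        # columns sum to 1
    return sims @ weights                               # (num_actions,)


# Hypothetical toy data: three prototypes in a 2-D latent space, two actions.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
assignment_logits = np.array([[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]])
latent = np.array([0.0, 0.0])  # coincides with prototype 0

scores = action_scores(latent, prototypes, assignment_logits)
best_action = int(np.argmax(scores))
```

In this toy setup the latent state coincides with the first prototype, which is mostly assigned to the first action, so that action receives the highest score. In a trainable version, both `prototypes` and `assignment_logits` would be parameters updated by gradient descent, and the explanation for a decision would point to the prototypes that contributed most to the chosen action.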

Full record

Year of publication: 2023
Conference name: XAI.it 2023: Italian Workshop on Explainable Artificial Intelligence 2023
Keywords: interpretable deep learning, reinforcement learning, explainable artificial intelligence
Type: 04 Publication in conference proceedings::04b Conference paper in volume
Citation: Understanding Deep RL agent decisions: a novel interpretable approach with trainable prototypes / Borzillo, Caterina; Ragno, Alessio; Capobianco, Roberto. - (2023). (Paper presented at the conference XAI.it 2023: Italian Workshop on Explainable Artificial Intelligence 2023, held in Rome).

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1696769
