Explainability, Quantified: Benchmarking XAI Techniques / Perotti, Alan; Borile, Claudio; Miola, Arianna; Nerini, Francesco Paolo; Baracco, Paolo; Panisson, André. - CCIS, vol. 2153 (2024), pp. 421-444. (Paper presented at the Explainable Artificial Intelligence conference held in Valletta, Malta) [DOI: 10.1007/978-3-031-63787-2_22].
Explainability, Quantified: Benchmarking XAI Techniques
Francesco Paolo Nerini
2024
Abstract
Modern Machine Learning (ML) has significantly advanced various fields, yet the challenge of understanding complex models, often referred to as the “black box problem”, remains a barrier to their widespread adoption, particularly in critical domains such as medical diagnosis and financial services. Explainable AI (XAI) addresses this challenge by augmenting ML models’ outputs with interpretable information that facilitates human understanding of their internal decision processes. Despite the proliferation of explainers in recent years, covering a wide range of ML tasks and explanation types, there is no consensus on what constitutes a good explanation, leaving ML practitioners without clear guidance for selecting appropriate explainers. We argue that quantifying explanation quality is the enabling factor for informed explainer choices, but many proposed evaluation criteria are either narrow in scope or closer to desired properties than to quantifiable metrics. This paper addresses this gap by proposing a standardized set of metrics for quantitatively evaluating explanations across diverse explanation types and ML tasks. We describe in detail the metrics of Effective Compactness, Rank Quality Index and Stability, designed to quantitatively assess explanation quality for various types of explanations (attributions, counterfactuals and rules) across different ML tasks (classification, regression and anomaly detection). We then present an exhaustive benchmarking framework for ML on tabular data, comprising open datasets, trained models, and state-of-the-art explainers. For each (data, model, explainer) tuple, we measure the time required to produce the explanations, apply our metrics and collect the results, highlighting correlations and trade-offs between desired properties. The resulting framework allows us to quantitatively rank explainers suitable for specific ML scenarios and select the most appropriate one based on the user’s requirements.
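To make the benchmarking procedure summarized above more concrete, the sketch below shows one possible structure for such an evaluation loop: it iterates over (data, model, explainer) tuples, times explanation production, and applies a set of quality metrics. This is a minimal illustration under stated assumptions, not the authors' implementation; all names here (the `benchmark` function, the explainer and metric callables, the dictionary-based inputs) are hypothetical placeholders rather than the paper's actual API.

```python
# Illustrative sketch of a benchmarking loop over (data, model, explainer)
# tuples. All names below are hypothetical placeholders, not the API used
# in the paper.
import time
from itertools import product


def benchmark(datasets, models, explainers, metrics):
    """For every (data, model, explainer) tuple, time explanation
    production, apply each quality metric, and collect the results.

    datasets:   dict name -> feature matrix X
    models:     dict name -> fitted model
    explainers: dict name -> callable(model, X) -> explanation
    metrics:    dict name -> callable(explanation, model, X) -> float
                (e.g. stand-ins for compactness, rank quality, stability)
    """
    records = []
    for (d_name, X), (m_name, model), (e_name, explain) in product(
        datasets.items(), models.items(), explainers.items()
    ):
        start = time.perf_counter()
        explanation = explain(model, X)        # produce the explanation
        elapsed = time.perf_counter() - start  # explanation production time

        row = {"data": d_name, "model": m_name,
               "explainer": e_name, "time_s": elapsed}
        # Apply each quality metric to the produced explanation.
        for metric_name, metric in metrics.items():
            row[metric_name] = metric(explanation, model, X)
        records.append(row)
    return records
```

In practice each model would be trained on its corresponding dataset, so the full cross product over models and datasets shown here is a simplification; the point of the sketch is only the per-tuple timing and metric collection described in the abstract.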