Learning Small Decision Trees with Large Domain / Eiben, E.; Ordyniak, S.; Paesani, G.; Szeider, S. - (2023), pp. 3184-3192. (International Joint Conference on Artificial Intelligence, Macao, China) [10.24963/ijcai.2023/355].

Learning Small Decision Trees with Large Domain

Paesani G.;
2023

Abstract

Decision trees (DTs) of small size or depth are favored because they facilitate explainability and interpretability. However, learning such an optimal DT from data is well known to be NP-hard. To overcome this complexity barrier, Ordyniak and Szeider (AAAI 2021) initiated the study of optimal DT learning from the parameterized complexity perspective. They showed that solution size (i.e., the number of nodes or the depth of the DT) is insufficient to obtain fixed-parameter tractability (FPT). Therefore, they proposed an FPT algorithm that uses two auxiliary parameters: the maximum difference (a structural property of the data set) and the maximum domain size. They left open the question of whether bounding the maximum domain size is necessary. The main result of this paper answers this question: we present FPT algorithms for learning a smallest or lowest-depth DT from data, parameterized only by solution size and maximum difference. Our algorithms are thus significantly more powerful than the one by Ordyniak and Szeider, as they can handle inputs whose features range over unbounded domains. We also close several gaps concerning the quality of approximation obtained by considering only DTs based on minimum support sets.
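To make the objects mentioned in the abstract concrete, the following is a minimal Python sketch on a toy Boolean dataset. It is not the paper's FPT algorithm: it brute-forces a smallest-depth decision tree by exhaustive search, and it computes one plausible reading of the "maximum difference" parameter (the largest number of features on which two differently labelled examples disagree). The dataset, the function names, and that reading of the parameter are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (NOT the paper's FPT algorithm) on a toy Boolean dataset:
#  * max_difference: assumed reading of the "maximum difference" parameter,
#  * min_depth: exhaustive search for the smallest decision-tree depth.

from itertools import combinations

# Toy classification instance: (feature tuple, class label).
EXAMPLES = [
    ((0, 1, 0), 1),
    ((1, 1, 0), 1),
    ((0, 0, 1), 0),
    ((1, 0, 0), 0),
]
NUM_FEATURES = 3


def max_difference(examples):
    """Largest number of features on which two differently labelled examples
    differ (assumed, illustrative reading of the parameter)."""
    best = 0
    for (x, cx), (y, cy) in combinations(examples, 2):
        if cx != cy:
            best = max(best, sum(a != b for a, b in zip(x, y)))
    return best


def min_depth(examples, depth_budget):
    """Smallest depth <= depth_budget of a DT classifying all examples, or None.
    Brute force: try every feature as the root test and recurse on the parts."""
    labels = {c for _, c in examples}
    if len(labels) <= 1:
        return 0                      # pure leaf, no test needed
    if depth_budget == 0:
        return None
    best = None
    for f in range(NUM_FEATURES):
        values = {x[f] for x, _ in examples}
        if len(values) <= 1:
            continue                  # testing f would not split anything
        branches = [[(x, c) for x, c in examples if x[f] == v] for v in values]
        depths = [min_depth(b, depth_budget - 1) for b in branches]
        if all(d is not None for d in depths):
            cand = 1 + max(depths)
            best = cand if best is None else min(best, cand)
    return best


if __name__ == "__main__":
    print("maximum difference:", max_difference(EXAMPLES))            # 3 here
    print("smallest DT depth:", min_depth(EXAMPLES, NUM_FEATURES))
```

On this toy instance the second feature already equals the label, so a depth-1 tree suffices; the point of the sketch is only to illustrate the quantities the abstract parameterizes by, while the exhaustive search makes clear why depth or size alone does not yield tractability.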
2023
International Joint Conference on Artificial Intelligence
Knowledge Representation and Reasoning; Computational complexity of reasoning; Machine Learning; Explainable/Interpretable machine learning
04 Publication in conference proceedings::04b Conference paper in a volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1747581

Citations
  • Scopus 14