Interpreting deep learning models for entity resolution: an experience report using LIME

Di Cicco Vincenzo; Firmani Donatella; Koudas Nick; Merialdo Paolo; Srivastava Divesh

doi:10.1145/3329859.3329878

Entity Resolution (ER) seeks to determine which records refer to the same entity (e.g., matching products sold on multiple websites). The sheer number of ways humans represent and misrepresent information about real-world entities makes ER a challenging problem. Deep Learning (DL) has produced impressive results in natural language processing, and recent work has therefore started exploring DL approaches to the ER problem, with encouraging results. However, we are still far from understanding why and when these approaches work in the ER setting. We are developing a methodology, Mojito, to produce explainable interpretations of the output of DL models for the ER task. Our methodology is based on LIME, a popular tool for producing prediction explanations for generic classification tasks. In this paper, we report our first experiences in interpreting recent DL models for the ER task. Our results demonstrate the importance of explanations in the DL space, and suggest that, when assessing the performance of DL algorithms for ER, accuracy alone may not be sufficient to demonstrate generality and reproducibility in a production environment.
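The abstract does not say how Mojito adapts LIME to record pairs, so the following is only a minimal sketch of the general LIME idea applied to an ER-style matcher, not the authors' method and not the API of the LIME library. It perturbs one record by dropping tokens, scores each perturbed pair with a black-box matcher, and fits a locality-weighted linear surrogate whose coefficients serve as token importances. Everything named here (`jaccard_matcher`, `explain_pair`, `solve`, the kernel width, the sample count, the example records) is a hypothetical choice made for illustration, and a Jaccard overlap score stands in for a real DL model.

```python
import math
import random


def jaccard_matcher(left_tokens, right_tokens):
    """Toy stand-in for a DL matcher (assumption): Jaccard overlap in [0, 1]."""
    a, b = set(left_tokens), set(right_tokens)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def explain_pair(left, right, matcher, num_samples=500,
                 kernel_width=0.75, ridge=1e-3, seed=0):
    """Return (token, weight) pairs for the tokens of `left`, largest |weight| first."""
    tokens = left.split()
    right_tokens = right.split()
    d = len(tokens)
    if d == 0:
        raise ValueError("left record has no tokens to perturb")
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for i in range(num_samples):
        # The first sample is the unperturbed record; the rest drop random tokens.
        mask = [1] * d if i == 0 else [rng.randint(0, 1) for _ in range(d)]
        kept = [t for t, m in zip(tokens, mask) if m]
        targets.append(matcher(kept, right_tokens))
        distance = 1.0 - sum(mask) / d  # fraction of tokens removed
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
        rows.append(mask + [1])  # trailing 1 is the intercept column
    # Weighted ridge regression: (X^T W X + ridge * I) beta = X^T W y
    p = d + 1
    A = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
          + (ridge if i == j else 0.0) for j in range(p)] for i in range(p)]
    b = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
         for i in range(p)]
    beta = solve(A, b)
    return sorted(zip(tokens, beta[:d]), key=lambda tw: -abs(tw[1]))


if __name__ == "__main__":
    # Hypothetical product records, invented for this example.
    for token, weight in explain_pair("sony bravia 55 inch tv black",
                                      "sony bravia tv 55", jaccard_matcher):
        print(f"{token:>8s} {weight:+.3f}")
```

In this sketch, a positive weight means that keeping the token pushes the matcher toward "match" and a negative weight means it pushes away. Swapping `jaccard_matcher` for a function that wraps a trained model's match probability would leave the rest of the procedure unchanged.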

Interpreting deep learning models for entity resolution: an experience report using LIME / Di Cicco, Vincenzo; Firmani, Donatella; Koudas, Nick; Merialdo, Paolo; Srivastava, Divesh. - (2019), pp. 1-4. (Paper presented at the Second International Workshop on Exploiting Artificial Intelligence Techniques for Data Management associated with the 2019 ACM SIGMOD/PODS Conference, aiDM@SIGMOD 2019, held in Amsterdam, The Netherlands) [10.1145/3329859.3329878].

Year of publication: 2019

Conference name: Second International Workshop on Exploiting Artificial Intelligence Techniques for Data Management associated with the 2019 ACM SIGMOD/PODS Conference, aiDM@SIGMOD 2019

Type: 04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1640582
