In this paper, we present ASPEN, an answer set programming (ASP) implementation of a recently proposed declarative framework for collective entity resolution (ER). While an ASP encoding had been previously suggested, several practical issues had been neglected, most notably, the question of how to efficiently compute the (externally defined) similarity facts that are used in rule bodies. This leads us to propose new variants of the encodings (including Datalog approximations) and show how to employ different functionalities of ASP solvers to compute (maximal) solutions, and (approximations of) the sets of possible and certain merges. A comprehensive experimental evaluation of ASPEN on real-world datasets shows that the approach is promising, achieving high accuracy in real-life ER scenarios. Our experiments also yield useful insights into the relative merits of different types of (approximate) ER solutions, the impact of recursion, and factors influencing performance.

ASPEN: ASP-Based System for Collective Entity Resolution / Xiang, Zhiliang; Bienvenu, Meghyn; Cima, Gianluca; Gutierrez-Basulto, Victor; Ibanez-Garcia, Yazmin. - (2024), pp. 788-799. ( 21st International Conference on Principles of Knowledge Representation and Reasoning, KR 2024 Hanoi, Vietnam ).

ASPEN: ASP-Based System for Collective Entity Resolution

Gianluca Cima; 2024

21st International Conference on Principles of Knowledge Representation and Reasoning, KR 2024
entity resolution; answer set programming
04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1768248