Entity resolution (ER) seeks to identify which records in a data set refer to the same real-world entity. Given the diversity of ways in which entities can be represented, ER is known to be a challenging task for automated strategies, but relatively easier for expert humans. Nonetheless, humans can also make mistakes. Our contribution is an error correction toolkit that can be leveraged by a variety of hybrid human-machine ER algorithms, based on a formal way of selecting “control queries” for the human experts. We demonstrate empirically that less recent ER algorithms equipped with our tool can perform even better than the most recent ER methods with built-in error correction.

Crowdsourced entity resolution with control queries / Galhotra, Sainyam; Firmani, Donatella; Saha, Barna; Srivastava, Divesh. - 2400:(2019). (Paper presented at the 27th Italian Symposium on Advanced Database Systems, SEBD 2019, held in Grosseto, Italy).

Crowdsourced entity resolution with control queries

Firmani Donatella
2019

27th Italian Symposium on Advanced Database Systems, SEBD 2019
data integration, entity resolution, data cleaning
04 Publication in conference proceedings::04b Conference paper in volume
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1638629