Comparing state of the art rule-based tools for information extraction / Scafoglieri, Federico; Lembo, Domenico. - 14244:(2023), pp. 157-165. (Paper presented at the International Joint Conference on Rules and Reasoning, RuleML+RR 2023, held in Oslo) [10.1007/978-3-031-45072-3_11].

Comparing state of the art rule-based tools for information extraction

Federico Scafoglieri; Domenico Lembo
2023

Abstract

In this paper, we present a comparative analysis of the leading rule-based information extraction systems in both research and industry, focusing on their main characteristics and their performance. Our evaluation was performed on a dataset of text documents about financial product descriptions from a real-world application scenario. In this study, we demonstrate that, while the considered tools share similarities in terms of expressiveness of their extractors and produce results of comparable quality, the implementation choices of their engines have a substantial impact on their overall execution time. Moreover, we emphasize that some of the considered tools offer seamless support for writing extraction rules, effectively addressing one of the common challenges associated with rule-based approaches.
Conference: International Joint Conference on Rules and Reasoning, RuleML+RR 2023
Keywords: Information extraction; benchmarks; rule-based systems
Type: 04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item

Lembo_Comparing_2023.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 2.29 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1690454