REDFM: a Filtered and Multilingual Relation Extraction Dataset / Huguet Cabot, Pere-Lluis; Tedeschi, Simone; Ngonga Ngomo, Axel-Cyrille; Navigli, Roberto. - 1:(2023), pp. 4326-4343. (Paper presented at the Association for Computational Linguistics conference held in Toronto, Canada).

REDFM: a Filtered and Multilingual Relation Extraction Dataset

Pere Lluis Huguet Cabot; Simone Tedeschi; Axel-Cyrille Ngonga Ngomo; Roberto Navigli

2023

Abstract

Relation Extraction (RE) is a task that identifies relationships between entities in a text, enabling the acquisition of relational facts and bridging the gap between natural language and structured knowledge. However, current RE models often rely on small datasets with low coverage of relation types, particularly when working with languages other than English. In this paper, we address this issue and provide two new resources that enable the training and evaluation of multilingual RE systems. First, we present SREDFM, an automatically annotated dataset covering 18 languages, 400 relation types, and 13 entity types, totaling more than 40 million triplet instances. Second, we propose REDFM, a smaller, human-revised dataset for seven languages that allows for the evaluation of multilingual RE systems. To demonstrate the utility of these novel datasets, we experiment with the first end-to-end multilingual RE model, mREBEL, which extracts triplets, including entity types, in multiple languages. We release our resources and model checkpoints at [https://www.github.com/babelscape/rebel](https://www.github.com/babelscape/rebel).

Publication year: 2023
Conference name: Association for Computational Linguistics
Keywords: relation; multilingual; nlp; Annotated datasets; Entity-types; Extraction modeling; Extraction systems; Natural languages; Relation extraction; Relationships between entities; Small data set; Structured knowledge
Type: 04 Publication in conference proceedings :: 04b Conference paper in volume

Files attached to this product

File	Size	Format
Cabot_REDFM_2023.pdf (archive managers only; type: publisher's version, published with the publisher's layout; license: all rights reserved)	3.1 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1686592
