Automatic identification and disambiguation of concepts and named entities in the Multilingual Wikipedia / Scozzafava, Federico; Raganato, Alessandro; Moro, Andrea; Navigli, Roberto. - (2015), pp. 357-366. (Paper presented at the AI*IA conference held in Ferrara, Italy, 23-25 September 2015) [10.1007/978-3-319-24309-2_27].
Automatic identification and disambiguation of concepts and named entities in the Multilingual Wikipedia
Scozzafava, Federico; Raganato, Alessandro; Moro, Andrea; Navigli, Roberto
2015
Abstract
In this paper we present an automatic multilingual annotation of the Wikipedia dumps in two languages, with both word senses (i.e. concepts) and named entities. We use Babelfy 1.0, a state-of-the-art multilingual Word Sense Disambiguation and Entity Linking system. As its reference inventory, Babelfy draws upon BabelNet 3.0, a very large multilingual encyclopedic dictionary and semantic network which connects concepts and named entities in 271 languages from different inventories, such as WordNet, Open Multilingual WordNet, Wikipedia, OmegaWiki, Wiktionary and Wikidata. In addition, we perform both an automatic evaluation of the dataset and a language-specific statistical analysis. In detail, we investigate the word sense distributions by part-of-speech and language, together with the similarity of the annotated entities and concepts for a random sample of interlinked Wikipedia pages in different languages. The annotated corpora are available at http://lcl.uniroma1.it/babelfied-wikipedia/.
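As context for the kind of annotation the abstract describes, the sketch below disambiguates a single sentence against BabelNet through the public Babelfy HTTP API. It is a minimal illustration under stated assumptions, not the batch pipeline used in the paper: the endpoint, parameter names, and response fields follow the publicly documented Babelfy REST interface, and YOUR_API_KEY is a placeholder for a key obtainable from babelfy.io.

    # Minimal sketch: disambiguating one sentence with the public Babelfy HTTP API.
    # Illustration only, not the pipeline used to annotate the Wikipedia dumps.
    # Endpoint and parameter names follow the public Babelfy REST documentation;
    # "YOUR_API_KEY" is a placeholder.
    import json
    import urllib.parse
    import urllib.request

    BABELFY_ENDPOINT = "https://babelfy.io/v1/disambiguate"

    def disambiguate(text, lang="EN", key="YOUR_API_KEY"):
        """Return the list of Babelfy annotations (BabelNet synset IDs) for `text`."""
        params = urllib.parse.urlencode({"text": text, "lang": lang, "key": key})
        with urllib.request.urlopen(f"{BABELFY_ENDPOINT}?{params}") as response:
            return json.loads(response.read().decode("utf-8"))

    if __name__ == "__main__":
        sentence = "BabelNet is a multilingual encyclopedic dictionary and semantic network."
        for annotation in disambiguate(sentence):
            # Each annotation covers a character span of the input and points to a
            # BabelNet synset, which may denote either a concept or a named entity.
            fragment = annotation["charFragment"]
            span = sentence[fragment["start"]:fragment["end"] + 1]
            print(span, "->", annotation["babelSynsetID"])

Each returned synset identifier can then be looked up in BabelNet to recover its lexicalizations in other languages, which is the property that makes the annotation multilingual.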