Nasari: integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities / Camacho-Collados, José; Pilehvar, Mohammed Taher; Navigli, Roberto. - In: ARTIFICIAL INTELLIGENCE. - ISSN 0004-3702. - Electronic. - 240:(2016), pp. 36-64. [10.1016/j.artint.2016.07.005]

Nasari: integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities

Camacho-Collados, José; Pilehvar, Mohammed Taher; Navigli, Roberto

2016

Abstract

Owing to the need for a deep understanding of linguistic items, semantic representation is considered to be one of the fundamental components of several applications in Natural Language Processing and Artificial Intelligence. As a result, semantic representation has been one of the prominent research areas in lexical semantics over the past decades. However, due mainly to the lack of large sense-annotated corpora, most existing representation techniques are limited to the lexical level and thus cannot be effectively applied to individual word senses. In this paper we put forward a novel multilingual vector representation, called NASARI, which not only enables accurate representation of word senses in different languages, but also provides two main advantages over existing approaches: (1) high coverage, including both concepts and named entities, and (2) comparability across languages and linguistic levels (i.e., words, senses and concepts), thanks to the representation of linguistic items in a single unified semantic space and in a joint embedded space, respectively. Moreover, our representations are flexible, can be applied to multiple applications and are freely available at http://lcl.uniroma1.it/nasari/. As an evaluation benchmark, we opted for four different tasks, namely word similarity, sense clustering, domain labeling, and Word Sense Disambiguation, for each of which we report state-of-the-art performance on several standard datasets across different languages. © 2016 Elsevier B.V.

Full record

	Year of publication
	
				2016
			
	Keywords
	
				Domain labeling; Lexical semantics; Semantic representation; Semantic similarity; Sense clustering; Word Sense Disambiguation
			
	Type
	
				01 Journal publication::01a Journal article
			
	Citation
	
				Nasari: integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities / Camacho-Collados, José; Pilehvar, Mohammed Taher; Navigli, Roberto. - In: ARTIFICIAL INTELLIGENCE. - ISSN 0004-3702. - Electronic. - 240:(2016), pp. 36-64. [10.1016/j.artint.2016.07.005]
			
	Belongs to type:
	
				01a Journal article

Files attached to this product

File	Size	Format
Camacho-Collados_NASARI_2016.pdf (repository managers only; Type: Publisher's version, i.e. the published version with the publisher's layout; License: All rights reserved)	1.13 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/975509
