NASARI: Multilingual Semantically-grounded Distributional Vectors / Camacho Collados, José; Pilehvar, Mohammed Taher; Navigli, Roberto. - In: ARTIFICIAL INTELLIGENCE. - ISSN 0004-3702. - ELECTRONIC. - (In press).

NASARI: Multilingual Semantically-grounded Distributional Vectors

Camacho Collados, José; Pilehvar, Mohammed Taher; Navigli, Roberto
In press

Abstract

Semantic representation lies at the core of computational lexical semantics, a key research field in Natural Language Processing. Because many applications in Natural Language Processing and Artificial Intelligence need a deeper understanding of linguistic units, semantic representation is considered one of their fundamental components. However, due mainly to the lack of large sense-annotated corpora, most existing representation techniques are limited to the lexical level and thus cannot be effectively applied to individual word senses. In this paper we put forward a novel multilingual vector representation, called Nasari, which not only enables accurate representation of word senses in different languages, but also provides two main advantages over existing approaches: (1) high coverage, including both concepts and named entities, and (2) comparability across languages and linguistic levels (i.e., words, senses and concepts), thanks to the representation of linguistic items in a single unified semantic space and in a joint embedded space, respectively. Moreover, our representations are flexible, can be applied to multiple applications, and are freely available at http://lcl.uniroma1.it/nasari/. As an evaluation benchmark, we chose four tasks, namely word similarity, sense clustering, domain labeling, and Word Sense Disambiguation, for each of which we report state-of-the-art performance on several standard datasets across different languages.
01 Journal publication::01a Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/870336