To date, the most successful word, word sense, and concept modelling techniques have used large corpora and knowledge resources to produce dense vector representations that capture semantic similarities in a relatively low-dimensional space. Most current approaches, however, suffer from a monolingual bias, with their strength depending on the amount of data available across languages. In this paper we address this issue and propose Conception, a novel technique for building language-independent vector representations of concepts which places multilinguality at its core while retaining explicit relationships between concepts. Our approach results in high-coverage representations that outperform the state of the art in multilingual and cross-lingual Semantic Word Similarity and Word Sense Disambiguation, proving particularly robust on low-resource languages. Conception – its software and the complete set of representations – is available at https://github.com/SapienzaNLP/conception.

Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations / Conia, Simone; Navigli, Roberto. - (2020), pp. 3268-3284. (Paper presented at the International Conference on Computational Linguistics, held online) [10.18653/v1/2020.coling-main.291].

Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations

Conia, Simone (first author); Navigli, Roberto (last author)
2020

International Conference on Computational Linguistics
natural language processing; meaning representation; word sense disambiguation; concept representation; multilinguality; deep learning; artificial intelligence;
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this record
File: Conia_Conception_2020.pdf
Access: open access
Note: https://aclanthology.org/2020.coling-main.291.pdf
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 1.06 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1494226
Citations
  • PMC: not available
  • Scopus: 13
  • ISI: not available