To date, the most successful word, word sense, and concept modelling techniques have used large corpora and knowledge resources to produce dense vector representations that capture semantic similarities in a relatively low-dimensional space. Most current approaches, however, suffer from a monolingual bias, with their strength depending on the amount of data available across languages. In this paper we address this issue and propose Conception, a novel technique for building language-independent vector representations of concepts which places multilinguality at its core while retaining explicit relationships between concepts. Our approach results in high-coverage representations that outperform the state of the art in multilingual and cross-lingual Semantic Word Similarity and Word Sense Disambiguation, proving particularly robust on low-resource languages. Conception – its software and the complete set of representations – is available at https://github.com/SapienzaNLP/conception.
Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations / Conia, Simone; Navigli, Roberto. - (2020), pp. 3268-3284. (Intervento presentato al convegno International Conference on Computational Linguistics tenutosi a Online) [10.18653/v1/2020.coling-main.291].
Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations
Conia, Simone
Primo
;Navigli, Roberto
Ultimo
2020
Abstract
To date, the most successful word, word sense, and concept modelling techniques have used large corpora and knowledge resources to produce dense vector representations that capture semantic similarities in a relatively low-dimensional space. Most current approaches, however, suffer from a monolingual bias, with their strength depending on the amount of data available across languages. In this paper we address this issue and propose Conception, a novel technique for building language-independent vector representations of concepts which places multilinguality at its core while retaining explicit relationships between concepts. Our approach results in high-coverage representations that outperform the state of the art in multilingual and cross-lingual Semantic Word Similarity and Word Sense Disambiguation, proving particularly robust on low-resource languages. Conception – its software and the complete set of representations – is available at https://github.com/SapienzaNLP/conception.File | Dimensione | Formato | |
---|---|---|---|
Conia_Conception_2020.pdf
accesso aperto
Note: https://aclanthology.org/2020.coling-main.291.pdf
Tipologia:
Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza:
Creative commons
Dimensione
1.06 MB
Formato
Adobe PDF
|
1.06 MB | Adobe PDF |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.