LexicoMatic: Automatic Creation of Multilingual Lexical-Semantic Dictionaries / Martelli, Federico; Procopio, Luigi; Barba, Edoardo; Navigli, Roberto. - (2023), pp. 820-833. (Paper presented at the International Joint Conference on Natural Language Processing and Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, held in Bali) [10.18653/v1/2023.ijcnlp-main.53].
LexicoMatic: Automatic Creation of Multilingual Lexical-Semantic Dictionaries
Federico Martelli (first author); Luigi Procopio (second author); Edoardo Barba (second-to-last author); Roberto Navigli (last author)
2023
Abstract
Lexical-semantic resources such as wordnets and multilingual dictionaries often suffer from significant coverage issues, especially in languages other than English. While improving their coverage manually is a prohibitively expensive undertaking, current approaches to the automatic creation of such resources fail to investigate the latest advances achieved in relevant fields, such as cross-lingual annotation projection. In this work, we address these shortcomings and propose LEXICOMATIC, a novel resource-independent approach to the automatic construction and expansion of multilingual semantic dictionaries, in which we formulate the task as an annotation projection problem. In addition, we tackle the lack of a comprehensive multilingual evaluation framework and put forward a new, entirely manually curated benchmark featuring 9 languages. We evaluate LEXICOMATIC with an extensive array of experiments and demonstrate the effectiveness of our approach, achieving a new state of the art across all languages under consideration. We release our novel evaluation benchmark at: https://github.com/SapienzaNLP/lexicomatic.

File | Size | Format |
---|---|---|
Martelli_Lexico-matic_2023.pdf (open access. Note: DOI: 10.18653/v1/2023.ijcnlp-main.53. Type: publisher's version, i.e. the published version with the publisher's layout. License: All rights reserved) | 387.32 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.