Mainstream computational lexical semantics embraces the assumption that word senses can be represented as discrete items of a predefined inventory. In this paper we show that this need not be the case, and propose a unified model that is able to produce contextually appropriate definitions. In our model, Generationary, we employ a novel span-based encoding scheme to fine-tune an English pre-trained encoder-decoder system to generate glosses. We show that, even though we drop the need to choose from a predefined sense inventory, our model can be employed effectively: not only does Generationary outperform previous approaches in the generative task of Definition Modeling in many settings, but it also matches or surpasses the state of the art in discriminative tasks such as Word Sense Disambiguation and Word-in-Context. Finally, we show that Generationary benefits from training on data from multiple inventories, with strong gains on various zero-shot benchmarks, including a novel dataset of definitions for free adjective-noun phrases. The software and reproduction materials are available at http://generationary.org.
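To illustrate the kind of span-based encoding the abstract refers to, the sketch below wraps a target span of the input sentence in delimiter tokens, so that a fine-tuned encoder-decoder can locate the expression to be glossed. The delimiter strings (`<define>`, `</define>`) and the helper name are illustrative assumptions, not the paper's actual implementation.

```python
def mark_span(tokens, start, end, open_tok="<define>", close_tok="</define>"):
    """Wrap the target span tokens[start:end] in delimiter tokens.

    The marked sequence can then be fed to a pre-trained seq2seq model
    fine-tuned to emit a gloss for the delimited expression.
    Delimiter strings here are assumptions for illustration only.
    """
    if not (0 <= start < end <= len(tokens)):
        raise ValueError("span out of range")
    return tokens[:start] + [open_tok] + tokens[start:end] + [close_tok] + tokens[end:]

# Example: marking the adjective-noun phrase "hot dog" in context.
sentence = "She bought a hot dog at the stadium".split()
encoded = mark_span(sentence, 3, 5)
print(" ".join(encoded))
# → She bought a <define> hot dog </define> at the stadium
```

A BART-style encoder-decoder fine-tuned on (marked sentence, gloss) pairs would then generate the contextual definition as its output sequence.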
Generationary or “How We Went beyond Word Sense Inventories and Learned to Gloss” / Bevilacqua, Michele; Maru, Marco; Navigli, Roberto. - (2020), pp. 7207-7221. Paper presented at The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), held online. [10.18653/v1/2020.emnlp-main.585].
Generationary or “How We Went beyond Word Sense Inventories and Learned to Gloss”
Bevilacqua, Michele; Maru, Marco; Navigli, Roberto
2020
| File | Type | License | Size | Format |
|---|---|---|---|---|
| Bevilacqua_Generationary_2020.pdf (open access) | Publisher's version (published with the publisher's layout) | Creative Commons | 516.62 kB | Adobe PDF |
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.