
Generationary or “How We Went beyond Word Sense Inventories and Learned to Gloss” / Bevilacqua, Michele; Maru, Marco; Navigli, Roberto. - (2020), pp. 7207-7221. (Paper presented at The 2020 Conference on Empirical Methods in Natural Language Processing, held online) [10.18653/v1/2020.emnlp-main.585].

Generationary or “How We Went beyond Word Sense Inventories and Learned to Gloss”

Bevilacqua, Michele; Maru, Marco; Navigli, Roberto
2020

Abstract

Mainstream computational lexical semantics embraces the assumption that word senses can be represented as discrete items of a predefined inventory. In this paper we show that this need not be the case, and propose a unified model that is able to produce contextually appropriate definitions. In our model, Generationary, we employ a novel span-based encoding scheme which we use to fine-tune an English pre-trained Encoder-Decoder system to generate glosses. We show that, even though we drop the need to choose from a predefined sense inventory, our model can be employed effectively: not only does Generationary outperform previous approaches in the generative task of Definition Modeling in many settings, but it also matches or surpasses the state of the art in discriminative tasks such as Word Sense Disambiguation and Word-in-Context. Finally, we show that Generationary benefits from training on data from multiple inventories, with strong gains on various zero-shot benchmarks, including a novel dataset of definitions for free adjective-noun phrases. The software and reproduction materials are available at http://generationary.org.
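The span-based encoding mentioned in the abstract can be illustrated with a minimal sketch: the occurrence to be defined is marked inside its context before the sentence is fed to the encoder-decoder. The marker tokens and the `seq2seq.generate` call below are illustrative assumptions, not the paper's actual tokens or API.

```python
def mark_span(context: str, start: int, end: int) -> str:
    """Wrap the character span [start, end) in (illustrative) marker tokens,
    so the encoder knows which occurrence to gloss."""
    return (context[:start] + "<define> " + context[start:end]
            + " </define>" + context[end:])

# The marked sentence would then go to a fine-tuned encoder-decoder
# (e.g. a BART-style model), whose decoder emits the gloss:
#   gloss = seq2seq.generate(mark_span(sentence, i, j))   # hypothetical call

print(mark_span("She sat on the bank of the river.", 15, 19))
# → She sat on the <define> bank </define> of the river.
```

Because the model conditions on the full marked context, the same word form ("bank") can receive different glosses in different sentences, which is what lets the approach dispense with a fixed sense inventory.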
2020
The 2020 Conference on Empirical Methods in Natural Language Processing
definition modeling; word sense disambiguation; natural language generation; word-in-context; computational lexical semantics; natural language processing; sense inventory
04 Conference proceedings publication::04b Conference paper in volume
Files attached to this item

File: Bevilacqua_Generationary_2020.pdf
Access: open access
Type: Publisher's version (published with the publisher's layout)
License: Creative Commons
Size: 516.62 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11573/1465840
Citations
  • Scopus: 37