The lexical substitution task aims at generating a list of suitable replacements for a target word in context, ideally keeping the meaning of the modified text unchanged. While its usage has increased in recent years, the paucity of annotated data prevents the fine-tuning of neural models on the task, hindering the full exploitation of recently introduced powerful architectures such as language models. Furthermore, lexical substitution is usually evaluated in a framework that is strictly bound to a limited vocabulary, making it impossible to credit appropriate, but out-of-vocabulary, substitutes. To address these issues, we propose GeneSis (Generating Substitutes in contexts), the first generative approach to lexical substitution. Thanks to a seq2seq model, we generate substitutes for a word according to the context it appears in, attaining state-of-the-art results on different benchmarks. Moreover, our approach allows silver data to be produced to further improve the performance of lexical substitution systems. Along with an extensive analysis of GeneSis results, we also present a human evaluation of the generated substitutes in order to assess their quality. We release the fine-tuned models, the generated datasets, and the code to reproduce the experiments at https://github.com/SapienzaNLP/genesis.

GeneSis: A Generative Approach to Substitutes in Context / Lacerra, Caterina; Tripodi, Rocco; Navigli, Roberto. - (2021), pp. 10810-10823. (Paper presented at the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, held in Punta Cana, Dominican Republic) [10.18653/v1/2021.emnlp-main.844].

GeneSis: A Generative Approach to Substitutes in Context

Lacerra, Caterina; Tripodi, Rocco; Navigli, Roberto
2021

2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
natural language processing; lexical semantics; lexical substitution
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this record
File: Lacerra_GENESIS_2021.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 1.3 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1604115