Research products catalogue

The lexical substitution task aims at finding suitable replacements for words in context. It has proved to be useful in several areas, such as word sense induction and text simplification, as well as in more practical applications such as writing-assistant tools. However, the paucity of annotated data has forced researchers to apply mainly unsupervised approaches, limiting the applicability of large pre-trained models and thus hampering the potential benefits of supervised approaches to the task. In this paper, we mitigate this issue by proposing ALaSca, a novel approach to automatically creating large-scale datasets for English lexical substitution. ALaSca allows examples to be produced for potentially any word in a language vocabulary and to cover most of the meanings it lists. Thanks to this, we can unleash the full potential of neural architectures and finetune them on the lexical substitution task. Indeed, when using our data, a transformer-based model performs substantially better than when using manually annotated data only. We release ALaSca at https://sapienzanlp.github.io/alasca/.

ALaSca: an Automated approach for Large-Scale Lexical Substitution / Lacerra, C., Pasini, T., Tripodi, R., Navigli, R. - In: IJCAI. - ISSN 1045-0823. - (2021), pp. 3836-3842. (30th International Joint Conference on Artificial Intelligence, IJCAI 2021, Online) [10.24963/ijcai.2021/528].

ALaSca: an Automated approach for Large-Scale Lexical Substitution

Lacerra, Caterina;Pasini, Tommaso;Tripodi, Rocco;Navigli, Roberto

2021

Full record

Year of publication: 2021
Conference name: 30th International Joint Conference on Artificial Intelligence, IJCAI 2021
Keywords: natural language processing; lexical semantics; lexical substitution
Type: 04 Publication in conference proceedings::04c Conference paper in journal
Belongs to type: 04c Conference paper in journal

Files attached to this product

File	Size	Format
Lacerra_ALaSca_2021.pdf (open access; type: publisher's version, published with the publisher's layout; license: Creative Commons)	409.9 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1585582
