Reducing Disambiguation Biases in NMT by Leveraging Explicit Word Sense Information

Campolungo, Niccolò; Pasini, Tommaso; Emelin, Denis; Navigli, Roberto

doi:10.18653/v1/2022.naacl-main.355

Recent studies have shed some light on a common pitfall of Neural Machine Translation (NMT) models, stemming from their struggle to disambiguate polysemous words without lapsing into their most frequently occurring senses in the training corpus. In this paper, we first provide a novel approach for automatically creating high-precision sense-annotated parallel corpora, and then put forward a specifically tailored fine-tuning strategy for exploiting these sense annotations during training without introducing any additional requirement at inference time. The use of explicit senses proved to be beneficial to reduce the disambiguation bias of a baseline NMT model, while, at the same time, leading our system to attain higher BLEU scores than its vanilla counterpart in 3 language pairs.

Reducing Disambiguation Biases in NMT by Leveraging Explicit Word Sense Information / Campolungo, Niccolò; Pasini, Tommaso; Emelin, Denis; Navigli, Roberto. - (2022), pp. 4824-4838. (Paper presented at the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, held in Seattle, USA) [10.18653/v1/2022.naacl-main.355].



	Year of publication
	
				2022
			
	Conference name
	
				2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
			
	Keywords
	
				machine translation; disambiguation bias; reducing disambiguation bias; reducing; semantic bias; bias
			
	Type
	
				04 Publication in conference proceedings::04b Conference paper in volume
			


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1652971
