DIBIMT: A Gold Evaluation Benchmark for Studying Lexical Ambiguity in Machine Translation

Federico Martelli (co-first author); Stefano Perrella (co-first author); Niccolò Campolungo (co-first author); Roberto Navigli (co-first author)

2024

Abstract

Despite the remarkable progress made in the field of Machine Translation (MT), current systems still struggle when translating ambiguous words, especially when these words express infrequent meanings. To investigate and analyze the impact of lexical ambiguity on automatic translations, several tasks and evaluation benchmarks have been proposed in recent years. However, works in this research direction suffer from critical shortcomings. Indeed, existing evaluation datasets are not entirely manually curated, which significantly compromises their reliability. Furthermore, the current literature fails to provide detailed insights into the nature of the errors produced by models when translating ambiguous words, as it lacks a thorough manual error analysis across languages. With a view to overcoming these limitations, we propose Disambiguation Biases in MT (DiBiMT), an entirely manually curated evaluation benchmark for investigating disambiguation biases in eight language combinations and for assessing the ability of both commercial and non-commercial systems to handle ambiguous words. We also examine and detail the errors produced by models in this scenario by carrying out a manual error analysis in all language pairs. Additionally, we perform an extensive array of experiments aimed at studying the behavior of models when dealing with ambiguous words. Finally, we show the ineffectiveness of standard MT evaluation settings for assessing the disambiguation capabilities of systems, and highlight the need for additional efforts in this research direction and for ad hoc testbeds such as DiBiMT. Our benchmark is available at: https://nlp.uniroma1.it/dibimt/.
Keywords: Machine Translation; Word Sense Disambiguation; Semantic Biases in Machine Translation
Publication type: Journal article
DIBIMT: A Gold Evaluation Benchmark for Studying Lexical Ambiguity in Machine Translation / Martelli, Federico; Perrella, Stefano; Campolungo, Niccolò; Munda, Tina; Koeva, Svetla; Tiberius, Carole; Navigli, Roberto. In: Computational Linguistics, ISSN 1530-9312, 1:1 (2024). DOI: 10.1162/coli_a_00541
Attached files:
  • Martelli_DiBiMT_2024.pdf — open access
    Note: https://direct.mit.edu/coli/article-pdf/doi/10.1162/coli_a_00541/2474106/coli_a_00541.pdf
    Type: Publisher's version (published with the publisher's layout)
    License: Creative Commons
    Size: 1.63 MB
    Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11573/1722083