In this paper we present the University of Helsinki submissions to the WMT 2019 shared news translation task in three language pairs: English-German, English-Finnish and Finnish-English. This year we focused first on cleaning and filtering the training data using multiple data-filtering approaches, resulting in much smaller and cleaner training sets. For English-German, we trained sentence-level transformer models and compared different document-level translation approaches. For Finnish-English and English-Finnish, we focused on different segmentation approaches, and we also included a rule-based system for English-Finnish.

The University of Helsinki Submissions to the WMT19 News Translation Task / Talman, Aarne; Sulubacak, Umut; Vázquez, Raúl; Scherrer, Yves; Virpioja, Sami; Raganato, Alessandro; Hurskainen, Arvi; Tiedemann, Jörg. - (2019), pp. 412-423. (Paper presented at the Fourth Conference on Machine Translation, held in Florence, Italy) [10.18653/v1/W19-5347].

The University of Helsinki Submissions to the WMT19 News Translation Task

Raganato, Alessandro
2019

Fourth Conference on Machine Translation
transformer; machine translation; data filtering
04 Publication in conference proceedings::04b Conference paper in volume


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1553743