In this paper, we present the University of Helsinki submissions to the WMT 2019 shared news translation task in three language pairs: English-German, English-Finnish and Finnish-English. This year, we focused first on cleaning and filtering the training data using multiple data-filtering approaches, resulting in much smaller and cleaner training sets. For English-German, we trained sentence-level transformer models and compared different document-level translation approaches. For Finnish-English and English-Finnish, we focused on different segmentation approaches, and we also included a rule-based system for English-Finnish.
The University of Helsinki Submissions to the WMT19 News Translation Task / Talman, Aarne; Sulubacak, Umut; Vázquez, Raúl; Scherrer, Yves; Virpioja, Sami; Raganato, Alessandro; Hurskainen, Arvi; Tiedemann, Jörg. - (2019), pp. 412-423. (Paper presented at the Fourth Conference on Machine Translation, held in Florence, Italy) [10.18653/v1/W19-5347].