CUNI Submissions in WMT18
Tom Kocmi, Roman Sudarikov, Ondřej Bojar
Abstract
We participated in the WMT 2018 shared news translation task in three language pairs: English-Estonian, English-Finnish, and English-Czech. Our main focus was the low-resource Estonian-English pair, for which we used Finnish parallel data in a simple method: we first train a “parent model” on the high-resource language pair and then adapt it on the related low-resource language pair. This approach brings a substantial performance boost over the baseline system trained only on Estonian-English parallel data. Our systems are based on the Transformer architecture. For English-to-Czech translation, we evaluated last year's models of the hybrid phrase-based approach and of neural machine translation, mainly for comparison purposes.
- Anthology ID: W18-6416
- Volume: Proceedings of the Third Conference on Machine Translation: Shared Task Papers
- Month: October
- Year: 2018
- Address: Belgium, Brussels
- Venues: EMNLP | WMT | WS
- SIG: SIGMT
- Publisher: Association for Computational Linguistics
- Pages: 431–437
- URL: https://www.aclweb.org/anthology/W18-6416
- DOI: 10.18653/v1/W18-6416
- PDF: http://aclanthology.lst.uni-saarland.de/W18-6416.pdf