Marian: Cost-effective High-Quality Neural Machine Translation in C++

Marcin Junczys-Dowmunt, Kenneth Heafield, Hieu Hoang, Roman Grundkiewicz, Anthony Aue


Abstract
This paper describes the submissions of the “Marian” team to the WNMT 2018 shared task. We investigate combinations of teacher-student training, low-precision matrix products, auto-tuning and other methods to optimize the Transformer model on GPU and CPU. By further integrating these methods with the new averaging attention networks, a recently introduced faster Transformer variant, we create a number of high-quality, high-performance models on the GPU and CPU, dominating the Pareto frontier for this shared task.
Anthology ID: W18-2716
Volume: Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
Month: July
Year: 2018
Address: Melbourne, Australia
Venues: ACL | NGT | WS
Publisher: Association for Computational Linguistics
Pages: 129–135
URL: https://www.aclweb.org/anthology/W18-2716
DOI: 10.18653/v1/W18-2716
PDF: http://aclanthology.lst.uni-saarland.de/W18-2716.pdf