Byte-based Neural Machine Translation

Marta R. Costa-jussà, Carlos Escolano, José A. R. Fonollosa


Abstract
This paper presents experiments comparing character-based and byte-based neural machine translation systems. The main motivation for the byte-based system is to build multilingual neural machine translation systems that can share the same vocabulary. We compare both systems on several language pairs and find that test performance is similar for most pairs, while training time is slightly reduced for the byte-based system.
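To illustrate the shared-vocabulary motivation, the following minimal sketch contrasts character-level and byte-level segmentation of the same strings. It assumes UTF-8 as the byte encoding, which the abstract does not specify; the helper names `char_tokens` and `byte_tokens` and the sample words are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def char_tokens(text: str) -> List[str]:
    """Split text into Unicode characters (character-level units)."""
    return list(text)


def byte_tokens(text: str) -> List[int]:
    """Split text into byte values (byte-level units).

    Assumption: UTF-8 encoding; the paper's actual encoding may differ.
    """
    return list(text.encode("utf-8"))


# Hypothetical sample words from three languages.
samples: Dict[str, str] = {"en": "house", "es": "señal", "de": "Straße"}

char_vocab: Set[str] = set()
byte_vocab: Set[int] = set()
for text in samples.values():
    char_vocab.update(char_tokens(text))
    byte_vocab.update(byte_tokens(text))

# Every byte symbol lies in the fixed range 0..255, whatever the language,
# whereas the character vocabulary grows with each new script or alphabet.
print(len(char_vocab), len(byte_vocab))
```

Under this assumption the byte vocabulary is bounded by 256 symbols and is identical across languages, at the cost of longer sequences for non-ASCII text: a character such as "ñ" becomes two byte symbols.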
Anthology ID:
W17-4123
Volume:
Proceedings of the First Workshop on Subword and Character Level Models in NLP
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venues:
SCLeM | WS
Publisher:
Association for Computational Linguistics
Pages:
154–158
URL:
https://www.aclweb.org/anthology/W17-4123
DOI:
10.18653/v1/W17-4123
PDF:
http://aclanthology.lst.uni-saarland.de/W17-4123.pdf