Robust parfda Statistical Machine Translation Results

Ergun Biçici


Abstract
We build parallel feature decay algorithms (parfda) Moses statistical machine translation (SMT) models for language pairs in the translation task. parfda obtains results close to the top constrained phrase-based SMT with an average of 2.252 BLEU points difference on WMT 2017 datasets using significantly less computation for building SMT systems than that would be spent using all available corpora. We obtain BLEU upper bounds based on target coverage to identify which systems used additional data. We use PRO for tuning to decrease fluctuations in the results and postprocess translation outputs to decrease translation errors due to the casing of words. F1 scores on the key phrases of the English to Turkish testsuite that we prepared reveal that parfda achieves 2nd best results. Truecasing translations before scoring obtained the best results overall.
Anthology ID:
W18-6405
Volume:
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Month:
October
Year:
2018
Address:
Belgium, Brussels
Venues:
EMNLP | WMT | WS
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Note:
Pages:
345–354
Language:
URL:
https://www.aclweb.org/anthology/W18-6405
DOI:
10.18653/v1/W18-6405
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/W18-6405.pdf