Multi-Reward based Reinforcement Learning for Neural Machine Translation

Shuo Sun, Hongxu Hou, Nier Wu, Ziyue Guo, Chaowei Zhang


Abstract
Reinforcement learning (RL) has made remarkable progress in neural machine translation (NMT). However, it exists the problems with uneven sampling distribution, sparse rewards and high variance in training phase. Therefore, we propose a multi-reward reinforcement learning training strategy to decouple action selection and value estimation. Meanwhile, our method combines with language model rewards to jointly optimize model parameters. In addition, we add Gumbel noise in sampling to obtain more effective semantic information. To verify the robustness of our method, we not only conducted experiments on large corpora, but also performed on low-resource languages. Experimental results show that our work is superior to the baselines in WMT14 English-German, LDC2014 Chinese-English and CWMT2018 Mongolian-Chinese tasks, which fully certificates the effectiveness of our method.
Anthology ID:
2020.ccl-1.91
Volume:
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Month:
October
Year:
2020
Address:
Haikou, China
Venue:
CCL
SIG:
Publisher:
Chinese Information Processing Society of China
Note:
Pages:
984–993
Language:
English
URL:
https://www.aclweb.org/anthology/2020.ccl-1.91
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/2020.ccl-1.91.pdf