Target Foresight Based Attention for Neural Machine Translation

Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, Max Meng


Abstract
In neural machine translation, an attention model is used to identify the aligned source words for a target word (target foresight word) in order to select translation context, but it does not make use of any information of this target foresight word at all. Previous work proposed an approach to improve the attention model by explicitly accessing this target foresight word and demonstrated the substantial gains in alignment task. However, this approach is useless in machine translation task on which the target foresight word is unavailable. In this paper, we propose a new attention model enhanced by the implicit information of target foresight word oriented to both alignment and translation tasks. Empirical experiments on Chinese-to-English and Japanese-to-English datasets show that the proposed attention model delivers significant improvements in terms of both alignment error rate and BLEU.
Anthology ID:
N18-1125
Volume:
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1380–1390
Language:
URL:
https://www.aclweb.org/anthology/N18-1125
DOI:
10.18653/v1/N18-1125
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/N18-1125.pdf