Using Coreference Links to Improve Spanish-to-English Machine Translation

Lesly Miculicich Werlen, Andrei Popescu-Belis


Abstract
In this paper, we present a proof-of-concept implementation of a coreference-aware decoder for document-level machine translation. We assume that better translations should have coreference links that are closer to those in the source text, and implement this criterion in two ways. First, we define a similarity measure between source and target coreference structures, by projecting the target structures onto the source and reusing existing coreference metrics. Based on this similarity measure, we re-rank the translation hypotheses of a baseline system for each sentence. Alternatively, to address the lack of diversity of mentions in the MT hypotheses, we focus on mention pairs and integrate their coreference scores with the MT scores, resulting in post-editing decisions for mentions. The experimental results for Spanish-to-English MT on the AnCora-ES corpus show that the second approach yields a substantial increase in the accuracy of pronoun translation, with BLEU scores remaining constant.
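
The first approach, re-ranking the hypotheses of a baseline system by coreference similarity, can be illustrated with a minimal sketch. The abstract does not say how the similarity measure is combined with the MT score, so the linear interpolation, the `weight` parameter, the `Hypothesis` class, and all numeric values below are assumptions made for illustration only, not the authors' implementation or results.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One translation hypothesis for a sentence (hypothetical structure)."""
    text: str
    mt_score: float   # baseline MT score, higher is better (assumed normalized)
    coref_sim: float  # source/target coreference similarity in [0, 1] (assumed precomputed)


def rerank(hypotheses, weight=0.5):
    """Sort hypotheses by an assumed linear mix of MT score and coreference similarity."""
    def combined(h):
        return (1.0 - weight) * h.mt_score + weight * h.coref_sim
    return sorted(hypotheses, key=combined, reverse=True)


# Illustrative values only: the second hypothesis has a slightly lower MT score
# but keeps the pronoun consistent with its antecedent.
nbest = [
    Hypothesis("The girl arrived. He was tired.", mt_score=0.9, coref_sim=0.2),
    Hypothesis("The girl arrived. She was tired.", mt_score=0.8, coref_sim=0.9),
]
best = rerank(nbest, weight=0.5)[0]
print(best.text)
```

With `weight=0.0` the ranking falls back to the baseline MT order; larger weights favor hypotheses whose coreference links are closer to those of the source.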
Anthology ID: W17-1505
Volume: Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017)
Month: April
Year: 2017
Address: Valencia, Spain
Venues: CORBON | WS
Publisher: Association for Computational Linguistics
Pages: 30–40
URL: https://www.aclweb.org/anthology/W17-1505
DOI: 10.18653/v1/W17-1505
PDF: http://aclanthology.lst.uni-saarland.de/W17-1505.pdf