Cross-lingual Visual Verb Sense Disambiguation

Spandana Gella, Desmond Elliott, Frank Keller


Abstract
Recent work has shown that visual context improves cross-lingual sense disambiguation for nouns. We extend this line of work to the more challenging task of cross-lingual verb sense disambiguation, introducing the MultiSense dataset of 9,504 images annotated with English, German, and Spanish verbs. Each image in MultiSense is annotated with an English verb and its translation in German or Spanish. We show that cross-lingual verb sense disambiguation models benefit from visual context, compared to unimodal baselines. We also show that the verb sense predicted by our best disambiguation model can improve the results of a text-only machine translation system when used for a multimodal translation task.
Anthology ID:
N19-1200
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1998–2004
Language:
URL:
https://www.aclweb.org/anthology/N19-1200
DOI:
10.18653/v1/N19-1200
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/N19-1200.pdf
Video:
 https://vimeo.com/354228781