Spectral Graph-Based Method of Multimodal Word Embedding

Kazuki Fukui, Takamasa Oshikiri, Hidetoshi Shimodaira


Abstract
In this paper, we propose a novel method for multimodal word embedding that exploits a generalized framework of multi-view spectral graph embedding to take into account the visual appearances or scenes denoted by words in a corpus. We evaluated our method on word similarity tasks and a concept-to-image search task, and found that it provides word representations that reflect visual information, at the cost of a modest drop in word similarity performance. Moreover, we demonstrate that our method captures multimodal linguistic regularities, which enable recovering relational similarities between words and images by vector arithmetic.
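The abstract builds on spectral graph embedding. As a point of reference, here is a minimal, hypothetical sketch of plain (single-view) spectral embedding via eigenvectors of the normalized graph Laplacian; it is not the paper's multi-view method, and the toy adjacency matrix and dimensionality are illustrative assumptions.

```python
import numpy as np

def spectral_embedding(A, dim=2):
    """Embed graph nodes using the smallest nontrivial eigenvectors of the
    symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    (Illustrative sketch only, not the paper's multi-view formulation.)"""
    d = A.sum(axis=1)                      # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)          # assumes no isolated nodes
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue ~ 0); keep the next `dim`.
    return vecs[:, 1:dim + 1]

# Toy co-occurrence graph: a 4-node path, purely for demonstration.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
emb = spectral_embedding(A, dim=2)
print(emb.shape)
```

In a multi-view setting such as the paper's, similar Laplacian-based objectives are combined across modalities (e.g. text and images) so that the learned coordinates align related items from different views.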
Anthology ID:
W17-2405
Volume:
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing
Month:
August
Year:
2017
Address:
Vancouver, Canada
Venues:
TextGraphs | WS
Publisher:
Association for Computational Linguistics
Pages:
39–44
URL:
https://www.aclweb.org/anthology/W17-2405
DOI:
10.18653/v1/W17-2405
PDF:
http://aclanthology.lst.uni-saarland.de/W17-2405.pdf