Extracting Social Networks from Literary Text with Word Embedding Tools

Gerhard Wohlgenannt, Ekaterina Chernyak, Dmitry Ilvovsky


Abstract
In this paper, a social network is extracted from a literary text. The social network shows how frequently the characters interact and how similar their social behavior is. Two types of similarity measures are used: the first applies co-occurrence statistics, while the second exploits cosine similarity on different types of word embedding vectors. The results are evaluated by a paid micro-task crowdsourcing survey. The experiments suggest that specific types of word embeddings, such as word2vec, are well suited to the task at hand and to the specific circumstances of literary fiction text.
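
The two kinds of similarity measure named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sentence-level co-occurrence window, the helper names `cooccurrence_counts` and `cosine_similarity`, the character names, and the toy vectors are all assumptions made for illustration. In practice the vectors would come from a trained embedding model such as word2vec.

```python
from collections import Counter
from itertools import combinations
import math


def cooccurrence_counts(sentences, characters):
    """Count how often two character names appear in the same sentence.

    Assumption: one sentence is the co-occurrence window, and names are
    single lowercase tokens separated by whitespace.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        present = sorted(name for name in characters if name in tokens)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy data, not taken from the paper.
sentences = [
    "alice met bob at the gate",
    "bob and carol argued",
    "alice wrote to bob",
]
characters = {"alice", "bob", "carol"}
counts = cooccurrence_counts(sentences, characters)

# Hypothetical 3-dimensional vectors standing in for learned embeddings.
toy_vectors = {
    "alice": [1.0, 0.0, 1.0],
    "bob": [1.0, 0.0, 1.0],
    "carol": [0.0, 1.0, 0.0],
}
sim_alice_bob = cosine_similarity(toy_vectors["alice"], toy_vectors["bob"])
sim_alice_carol = cosine_similarity(toy_vectors["alice"], toy_vectors["carol"])
```

Under these assumptions, the co-occurrence counts would serve as edge weights for interaction frequency, while the cosine scores would rank character pairs by how similar the contexts they appear in are.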
Anthology ID: W16-4004
Volume: Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)
Month: December
Year: 2016
Address: Osaka, Japan
Venues: LT4DH | WS
Publisher: The COLING 2016 Organizing Committee
Pages: 18–25
URL: https://www.aclweb.org/anthology/W16-4004
PDF: http://aclanthology.lst.uni-saarland.de/W16-4004.pdf