One Size Fits All? A simple LSTM for non-literal token and construction-level classification

Erik-Lân Do Dinh, Steffen Eger, Iryna Gurevych


Abstract
In this paper, we tackle four different tasks of non-literal language classification: token- and construction-level metaphor detection, classification of idiomatic use of infinitive-verb compounds, and classification of non-literal particle verbs. One of the tasks operates on the token level, while the other three classify constructions such as “hot topic” or “stehen lassen” (“to allow sth. to stand” vs. “to abandon so.”). The two metaphor detection tasks are in English, while the two non-literal language detection tasks are in German. We propose a simple context-encoding LSTM model and show that it outperforms the state of the art on two tasks. Additionally, we experiment with different embeddings for the token-level metaphor detection task and find that 1) their performance varies according to the genre, and 2) word2vec embeddings perform best on 3 out of 4 genres, despite being one of the simplest models tested. In summary, we present a large-scale analysis of a neural model for non-literal language classification (i) at different granularities, (ii) in different languages, and (iii) over different non-literal language phenomena.
Anthology ID:
W18-4508
Volume:
Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico
Venues:
COLING | LaTeCH | WS
Publisher:
Association for Computational Linguistics
Pages:
70–80
URL:
https://www.aclweb.org/anthology/W18-4508
PDF:
http://aclanthology.lst.uni-saarland.de/W18-4508.pdf