Learning to Embed Words in Context for Syntactic Tasks

Lifu Tu, Kevin Gimpel, Karen Livescu


Abstract
We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic category, and semantic role. We explore simple, efficient token embedding models based on standard neural network architectures. We learn token embeddings on a large amount of unannotated text and evaluate them as features for part-of-speech taggers and dependency parsers trained on much smaller amounts of annotated data. We find that predictors endowed with token embeddings consistently outperform baseline predictors across a range of context window and training set sizes.
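The paper's architectures are not detailed in this abstract. As a purely illustrative sketch of the general idea (not the authors' model), a context-dependent token embedding can be formed by concatenating a word's type vector with summaries of its left and right context windows; all names and the toy vectors below are hypothetical:

```python
import numpy as np

def token_embedding(tokens, index, vectors, window=2):
    """Hypothetical sketch: embed tokens[index] in context by
    concatenating its own (type) vector with the mean vectors of
    its left and right context windows."""
    dim = len(next(iter(vectors.values())))
    unk = np.zeros(dim)  # fallback vector for out-of-vocabulary words

    def vec(w):
        return vectors.get(w, unk)

    left = [vec(w) for w in tokens[max(0, index - window):index]] or [unk]
    right = [vec(w) for w in tokens[index + 1:index + 1 + window]] or [unk]
    return np.concatenate([vec(tokens[index]),
                           np.mean(left, axis=0),
                           np.mean(right, axis=0)])

# Toy type vectors, for illustration only.
vectors = {"the": np.array([1.0, 0.0]),
           "dog": np.array([0.0, 1.0]),
           "barks": np.array([0.5, 0.5])}
emb = token_embedding(["the", "dog", "barks"], 1, vectors)
```

Such a vector could then be fed as an extra feature to a part-of-speech tagger or dependency parser, which is how the paper evaluates its learned token embeddings.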
Anthology ID: W17-2632
Volume: Proceedings of the 2nd Workshop on Representation Learning for NLP
Month: August
Year: 2017
Address: Vancouver, Canada
Venues: RepL4NLP | WS
SIG: SIGREP
Publisher: Association for Computational Linguistics
Pages: 265–275
URL: https://www.aclweb.org/anthology/W17-2632
DOI: 10.18653/v1/W17-2632
PDF: http://aclanthology.lst.uni-saarland.de/W17-2632.pdf