Connecting Supervised and Unsupervised Sentence Embeddings

Gil Levi


Abstract
Representing sentences as numerical vectors while capturing their semantic context is an important and useful intermediate step in natural language processing. Representations that are both general and discriminative can serve as a tool for tackling various NLP tasks. While common sentence representation methods are unsupervised in nature, an approach for learning universal sentence representations in a supervised setting was recently presented by Conneau et al. (2017). We argue that, although that approach obtained promising results, it can be improved by adding various unsupervised constraints motivated by auto-encoders and by language models. We show that adding such constraints yields superior sentence embeddings. We compare our method with the original implementation and show improvements on several tasks.
Anthology ID:
W18-3010
Volume:
Proceedings of The Third Workshop on Representation Learning for NLP
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venues:
ACL | RepL4NLP | WS
SIG:
SIGREP
Publisher:
Association for Computational Linguistics
Pages:
79–83
URL:
https://www.aclweb.org/anthology/W18-3010
DOI:
10.18653/v1/W18-3010
PDF:
http://aclanthology.lst.uni-saarland.de/W18-3010.pdf