Neural Paraphrase Identification of Questions with Noisy Pretraining

Gaurav Singh Tomar, Thyago Duque, Oscar Täckström, Jakob Uszkoreit, Dipanjan Das


Abstract
We present a solution to the problem of paraphrase identification of questions. We focus on a recent dataset of question pairs annotated with binary paraphrase labels and show that a variant of the decomposable attention model of Parikh et al. (2016), in which word embeddings are replaced with character n-gram representations, achieves accurate performance on this task while being far simpler than many competing neural architectures. Furthermore, when the model is pretrained on a noisy dataset of automatically collected question paraphrases, it obtains the best reported performance on the dataset.
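The core modeling change described in the abstract is to represent each word as the sum of embeddings of its character n-grams rather than a single word embedding. The following is a minimal illustrative sketch of that idea, not the paper's implementation: the bucket count, embedding size, n-gram range, and hashing scheme are all assumptions chosen for brevity.

```python
import numpy as np

# Illustrative constants (assumptions, not from the paper).
VOCAB_BUCKETS = 10_000   # hashed n-gram vocabulary size
EMBED_DIM = 8            # embedding dimension
rng = np.random.default_rng(0)
EMBEDDINGS = rng.normal(size=(VOCAB_BUCKETS, EMBED_DIM))

def char_ngrams(word, n_min=3, n_max=6):
    """Return all character n-grams of a word, with boundary markers."""
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def embed_word(word):
    """Sum hashed character n-gram embeddings into one word vector."""
    ids = [hash(g) % VOCAB_BUCKETS for g in char_ngrams(word)]
    return EMBEDDINGS[ids].sum(axis=0)

# Misspellings and inflected forms share most of their n-grams,
# so their vectors remain close -- one motivation for subword units.
v1 = embed_word("paraphrase")
v2 = embed_word("paraphrases")
```

In the paper's setting, vectors like these would feed the attend/compare/aggregate steps of the decomposable attention model in place of pretrained word embeddings.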
Anthology ID:
W17-4121
Volume:
Proceedings of the First Workshop on Subword and Character Level Models in NLP
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venues:
SCLeM | WS
Publisher:
Association for Computational Linguistics
Note:
Pages:
142–147
URL:
https://www.aclweb.org/anthology/W17-4121
DOI:
10.18653/v1/W17-4121
PDF:
http://aclanthology.lst.uni-saarland.de/W17-4121.pdf