Preferred Answer Selection in Stack Overflow: Better Text Representations ... and Metadata, Metadata, Metadata

Steven Xu, Andrew Bennett, Doris Hoogeveen, Jey Han Lau, Timothy Baldwin


Abstract
Community question answering (cQA) forums provide a rich source of data for facilitating non-factoid question answering over many technical domains. Given this, there is considerable interest in answer retrieval from these kinds of forums. However, this is a difficult task, as the structure of these forums is very rich, and both metadata and text features are important for successful retrieval. While there has recently been a lot of work on solving this problem using deep learning models applied to question/answer text, this work has not looked at how to make use of the rich metadata available in cQA forums. We propose an attention-based model which achieves state-of-the-art results for text-based answer selection alone and, by making use of complementary metadata, achieves substantially better results on two reference datasets novel to this work.
Anthology ID:
W18-6119
Volume:
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text
Month:
November
Year:
2018
Address:
Brussels, Belgium
Venues:
EMNLP | WNUT | WS
Publisher:
Association for Computational Linguistics
Pages:
137–147
URL:
https://www.aclweb.org/anthology/W18-6119
DOI:
10.18653/v1/W18-6119
PDF:
http://aclanthology.lst.uni-saarland.de/W18-6119.pdf