QU-BIGIR at SemEval 2017 Task 3: Using Similarity Features for Arabic Community Question Answering Forums

Marwan Torki, Maram Hasanain, Tamer Elsayed


Abstract
In this paper we describe our QU-BIGIR system for the Arabic subtask D of the SemEval 2017 Task 3. Our approach builds on our participation in the past version of the same subtask. This year, our system uses different similarity measures that encodes lexical and semantic pairwise similarity of text pairs. In addition to well known similarity measures such as cosine similarity, we use other measures based on the summary statistics of word embedding representation for a given text. To rank a list of candidate question answer pairs for a given question, we learn a linear SVM classifier over our similarity features. Our best resulting run came second in subtask D with a very competitive performance to the first-ranking system.
Anthology ID:
S17-2059
Volume:
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Venue:
*SEMEVAL
SIGs:
SIGLEX | SIGSEM
Publisher:
Association for Computational Linguistics
Note:
Pages:
360–364
Language:
URL:
https://www.aclweb.org/anthology/S17-2059
DOI:
10.18653/v1/S17-2059
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/S17-2059.pdf