Tw-StAR at SemEval-2018 Task 1: Preprocessing Impact on Multi-label Emotion Classification

Hala Mulki, Chedi Bechikh Ali, Hatem Haddad, Ismail Babaoğlu


Abstract
In this paper, we describe our contribution to the SemEval-2018 contest. We tackled Task 1, “Affect in Tweets”, subtask E-c, “Detecting Emotions (multi-label classification)”. We developed a multi-label classification system, Tw-StAR, to recognize the emotions embedded in Arabic, English and Spanish tweets. To handle the multi-label classification problem with traditional classifiers, we employed the binary relevance transformation strategy, and a TF-IDF scheme was used to generate the tweets’ features. We investigated applying preprocessing tasks both individually and in combination to further improve performance. The results showed that specific combinations of preprocessing tasks could significantly improve the evaluation measures. This was later confirmed by the official results, in which our system ranked 3rd for both the Arabic and Spanish datasets and 14th for the English dataset.
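The binary relevance strategy mentioned in the abstract turns one multi-label problem into one independent yes/no problem per emotion, each trained on the same TF-IDF features. The following is a minimal sketch of that idea, not the Tw-StAR implementation: the whitespace tokenizer, the smoothed IDF formula, the perceptron used as a stand-in for a "traditional classifier", and the toy tweets and labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; the paper's preprocessing is richer.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed IDF per term: log((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def score(x, w, b):
    return b + sum(w.get(t, 0.0) * v for t, v in x.items())


def train_perceptron(vectors, targets, epochs=20):
    """Binary perceptron on sparse vectors (stand-in classifier), targets in {0, 1}."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, targets):
            pred = 1 if score(x, w, b) > 0 else 0
            if pred != y:
                step = 1.0 if y == 1 else -1.0
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + step * v
                b += step
    return w, b


def fit_binary_relevance(docs, label_sets, labels):
    """Train one independent binary classifier per label on shared features."""
    idf = fit_idf(docs)
    vectors = [tfidf(d, idf) for d in docs]
    models = {
        lab: train_perceptron(vectors, [1 if lab in s else 0 for s in label_sets])
        for lab in labels
    }
    return idf, models


def predict(doc, idf, models):
    """Return the set of labels whose binary classifier fires on this document."""
    x = tfidf(doc, idf)
    return {lab for lab, (w, b) in models.items() if score(x, w, b) > 0}


# Hypothetical toy data, invented for this sketch.
labels = ["joy", "anger", "fear"]
docs = [
    "i am so happy and excited today",
    "this is terrible i am furious",
    "so happy but also scared",
    "scared and furious about this",
]
label_sets = [{"joy"}, {"anger"}, {"joy", "fear"}, {"fear", "anger"}]

idf, models = fit_binary_relevance(docs, label_sets, labels)
predicted = predict("happy but scared", idf, models)
```

Because each label is decided separately, a tweet can receive zero, one, or several emotions. The trade-off of binary relevance is that it ignores correlations between labels.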
Anthology ID: S18-1024
Volume: Proceedings of The 12th International Workshop on Semantic Evaluation
Month: June
Year: 2018
Address: New Orleans, Louisiana
Venue: *SEMEVAL
SIGs: SIGLEX | SIGSEM
Publisher: Association for Computational Linguistics
Pages: 167–171
URL: https://www.aclweb.org/anthology/S18-1024
DOI: 10.18653/v1/S18-1024
PDF: http://aclanthology.lst.uni-saarland.de/S18-1024.pdf