UMDuluth-CS8761 at SemEval-2018 Task 2: Emojis: Too many Choices?

Jonathan Beaulieu, Dennis Asamoah Owusu


Abstract
In this paper, we present our system for assigning an emoji to a tweet based on the text. Each tweet was originally posted with an emoji which the task providers removed. Our task was to decide out of 20 emojis, which originally came with the tweet. Two datasets were provided - one in English and the other in Spanish. We treated the task as a standard classification task with the emojis as our classes and the tweets as our documents. Our best performing system used a Bag of Words model with a Linear Support Vector Machine as its’ classifier. We achieved a macro F1 score of 32.73% for the English data and 17.98% for the Spanish data.
Anthology ID:
S18-1061
Volume:
Proceedings of The 12th International Workshop on Semantic Evaluation
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Venue:
*SEMEVAL
SIGs:
SIGLEX | SIGSEM
Publisher:
Association for Computational Linguistics
Note:
Pages:
400–404
Language:
URL:
https://www.aclweb.org/anthology/S18-1061
DOI:
10.18653/v1/S18-1061
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/S18-1061.pdf