ABARUAH at SemEval-2019 Task 5 : Bi-directional LSTM for Hate Speech Detection

Arup Baruah, Ferdous Barbhuiya, Kuntal Dey


Abstract
In this paper, we present the results obtained using bi-directional long short-term memory (BiLSTM) models, with and without attention, and Logistic Regression (LR) models for SemEval-2019 Task 5, titled “HatEval: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter”. The results are for Subtask A in the English language. The BiLSTM and LR models are compared under two types of preprocessing: one with no stemming performed and no stopwords removed, and the other with stemming performed and stopwords removed. The BiLSTM model without attention performed the best for the first test, while the LR model with character n-grams performed the best for the second test. The BiLSTM model obtained an F1 score of 0.51 on the test set and an official ranking of 8/71.
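As an illustration of the character n-gram features mentioned in the abstract, the following is a minimal sketch of how such features could be counted for a single text. The function name `char_ngrams` and the n-gram range are assumptions made for this example. The abstract does not state the range or the feature weighting the authors used, so this is not their implementation.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count every character n-gram of length n_min..n_max in text.

    Hypothetical helper for illustration; the n-gram range is an assumption.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the text, one character at a time.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


# Bigrams and trigrams of a short token.
features = char_ngrams("hate", n_min=2, n_max=3)
```

Counts of this kind, computed per tweet, could then be turned into a feature vector and passed to a logistic regression classifier.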
Anthology ID:
S19-2065
Volume:
Proceedings of the 13th International Workshop on Semantic Evaluation
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota, USA
Venue:
*SEMEVAL
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
371–376
URL:
https://www.aclweb.org/anthology/S19-2065
DOI:
10.18653/v1/S19-2065
PDF:
http://aclanthology.lst.uni-saarland.de/S19-2065.pdf