Label Propagation-Based Semi-Supervised Learning for Hate Speech Classification

Ashwin Geet D’Sa, Irina Illina, Dominique Fohr, Dietrich Klakow, Dana Ruiter


Abstract
Research on hate speech classification has received increased attention. In real-life scenarios, only a small amount of labeled hate speech data is typically available for training a reliable classifier. Semi-supervised learning takes advantage of a small amount of labeled data together with a large amount of unlabeled data. In this paper, label propagation-based semi-supervised learning is explored for the task of hate speech classification. The quality of the labels assigned to the unlabeled set depends on the input representations. In this work, we show that pre-trained representations are label agnostic and yield poor results when used with label propagation. Neural network-based fine-tuning can be adopted to learn task-specific representations from a small amount of labeled data. We show that fully fine-tuned representations may not always be the best representations for label propagation, and that intermediate representations may perform better in a semi-supervised setup.
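To make the core technique concrete, the following is a minimal sketch of graph-based label propagation over an RBF similarity graph. It is not the paper's exact setup (the paper's inputs would be pre-trained or fine-tuned text representations); the function name, the RBF affinity choice, and the bandwidth parameter `sigma` are illustrative assumptions, and class labels are assumed to be integers `0..K-1` with `-1` marking unlabeled points.

```python
import numpy as np

def label_propagation(X, y, n_iter=100, sigma=1.0):
    """Propagate labels from labeled points (y >= 0) to unlabeled points
    (y == -1) over an RBF similarity graph. Illustrative sketch only;
    assumes integer class labels 0..K-1."""
    n = len(X)
    # Pairwise squared distances and RBF affinity matrix
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Row-normalize into a transition matrix
    T = W / W.sum(axis=1, keepdims=True)
    classes = np.unique(y[y >= 0])
    labeled = y >= 0
    # One-hot label distribution; unlabeled rows start at zero
    F = np.zeros((n, len(classes)))
    F[labeled, y[labeled]] = 1.0
    for _ in range(n_iter):
        F = T @ F
        # Clamp labeled points back to their known labels each iteration
        F[labeled] = 0.0
        F[labeled, y[labeled]] = 1.0
    return classes[F.argmax(axis=1)]
```

As a quick sanity check, two well-separated clusters with one labeled seed each should have the seed labels spread to their neighbors. scikit-learn provides a production counterpart in `sklearn.semi_supervised.LabelPropagation` / `LabelSpreading`, which use the same `-1`-means-unlabeled convention.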
Anthology ID: 2020.insights-1.8
Volume: Proceedings of the First Workshop on Insights from Negative Results in NLP
Month: November
Year: 2020
Address: Online
Venues: EMNLP | insights
Publisher: Association for Computational Linguistics
Pages: 54–59
URL: https://www.aclweb.org/anthology/2020.insights-1.8
DOI: 10.18653/v1/2020.insights-1.8
PDF: http://aclanthology.lst.uni-saarland.de/2020.insights-1.8.pdf