Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explainable Tool

Xiangru Tang, Xianjun Shen


Abstract
Recently, more and more data have been generated in the online world, filled with offensive language such as threats, swear words or straightforward insults. It is disgraceful for a progressive society, and then the question arises on how language resources and technologies can cope with this challenge. However, previous work only analyzes the problem as a whole but fails to detect particular types of offensive content in a more fine-grained way, mainly because of the lack of annotated data. In this work, we present a densely annotated data-set COLA
Anthology ID:
2020.ccl-1.97
Volume:
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Month:
October
Year:
2020
Address:
Haikou, China
Venue:
CCL
SIG:
Publisher:
Chinese Information Processing Society of China
Note:
Pages:
1045–1056
Language:
English
URL:
https://www.aclweb.org/anthology/2020.ccl-1.97
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/2020.ccl-1.97.pdf