Predicting Suicide Risk from Online Postings in Reddit: The UGent-IDLab submission to the CLPsych 2019 Shared Task A

Semere Kiros Bitew, Giannis Bekoulis, Johannes Deleu, Lucas Sterckx, Klim Zaporojets, Thomas Demeester, Chris Develder


Abstract
This paper describes IDLab’s text classification systems submitted to Task A as part of the CLPsych 2019 shared task. The aim of this shared task was to develop automated systems that predict the degree of suicide risk of people based on their posts on Reddit. Bag-of-words features, emotion features and post-level predictions are used to derive user-level predictions. Linear models and ensembles of these models are used to predict final scores. We find that predicting fine-grained risk levels is much more difficult than flagging potentially at-risk users. Furthermore, we do not find clear added value from building richer ensembles compared to simple baselines, given the available training data and the nature of the prediction task.
Anthology ID:
W19-3019
Volume:
Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Venues:
CLPsych | NAACL | WS
Publisher:
Association for Computational Linguistics
Pages:
158–161
URL:
https://www.aclweb.org/anthology/W19-3019
DOI:
10.18653/v1/W19-3019
PDF:
http://aclanthology.lst.uni-saarland.de/W19-3019.pdf