Crowd-sourcing annotation of complex NLU tasks: A case study of argumentative content annotation

Tamar Lavee, Lili Kotlerman, Matan Orbach, Yonatan Bilu, Michal Jacovi, Ranit Aharonov, Noam Slonim


Abstract
Recent advancements in machine reading and listening comprehension involve the annotation of long texts. Such tasks are typically time-consuming, making crowd annotation an attractive solution, yet their complexity often makes such a solution infeasible. In particular, a major concern is that crowd annotators may be tempted to skim through long texts and answer questions without reading thoroughly. We present a case study of adapting this type of task to the crowd. The task is to identify claims in a several-minute-long debate speech. We show that sentence-by-sentence annotation does not scale and that labeling only a subset of sentences is insufficient. Instead, we propose a scheme for effectively performing the full, complex task with crowd annotators, allowing the collection of large-scale annotated datasets. We believe that the encountered challenges and pitfalls, as well as the lessons learned, are relevant in general when collecting data for large-scale natural language understanding (NLU) tasks.
Anthology ID: D19-5905
Volume: Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP
Month: November
Year: 2019
Address: Hong Kong
Venues: EMNLP | WS
Publisher: Association for Computational Linguistics
Pages: 29–38
URL: https://www.aclweb.org/anthology/D19-5905
DOI: 10.18653/v1/D19-5905
PDF: http://aclanthology.lst.uni-saarland.de/D19-5905.pdf
Attachment: D19-5905.Attachment.pdf