A Hungarian Sentiment Corpus Manually Annotated at Aspect Level

Martina Katalin Szabó, Veronika Vincze, Katalin Ilona Simkó, Viktor Varga, Viktor Hangya


Abstract
In this paper we present a Hungarian sentiment corpus manually annotated at aspect level. The corpus consists of Hungarian opinion texts written about different types of products. The main aim of creating it was to produce a database suitable for developing text mining software tools. The corpus is unique among Hungarian resources: to the best of our knowledge, no digitized Hungarian sentiment corpus annotated at the level of fragments and targets has been created so far. In addition, many language elements of the corpus that are relevant to sentiment analysis received distinct types of tags in the annotation. We first present the annotation method and discuss the difficulties of the text annotation process. We then provide some quantitative and qualitative data on the corpus. We conclude with a description of the applicability of the corpus.
Anthology ID:
L16-1459
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2873–2878
URL:
https://www.aclweb.org/anthology/L16-1459
PDF:
http://aclanthology.lst.uni-saarland.de/L16-1459.pdf