Karin Hedberg


A Multi-domain Corpus of Swedish Word Sense Annotation
Richard Johansson | Yvonne Adesam | Gerlof Bouma | Karin Hedberg
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We describe the word sense annotation layer in Eukalyptus, a freely available five-domain corpus of contemporary Swedish with several annotation layers. The annotation uses the SALDO lexicon to define the sense inventory, and allows word sense annotation of compound segments and multiword units. We give an overview of the new annotation tool developed for this project, and finally present an analysis of the inter-annotator agreement between two annotators.