Variation in Universal Dependencies annotation: A token-based typological case study on adpossessive constructions

Kaius Sinnemäki, Viljami Haakana


Abstract
In this paper we present a method for identifying and analyzing adnominal possessive (adpossessive) constructions in 66 Universal Dependencies (UD) treebanks. We classify adpossessive constructions by morphological type (locus of marking) and present a workflow for detecting and analyzing them typologically. A preliminary evaluation suggests that the algorithm works fairly reliably for adpossessive constructions that are morphologically marked. However, it performs rather poorly for constructions that are not marked morphologically, so-called zero-marked constructions, because these are difficult to identify with the current annotation. We also discuss different types of annotation variation across treebanks of the same language and across treebanks of closely related languages. The research focuses on one well-circumscribed and universal construction in the hope of generating more interest in using UD for cross-linguistic comparison and of contributing to more consistent annotation of constructions in the UD annotation scheme.
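To make the notion of locus of marking concrete, the following is a minimal sketch of how such a classification could be run over CoNLL-U data. It is not the authors' algorithm. It assumes that possessors carry the `nmod:poss` relation, that dependent marking shows up as `Case=Gen` on the possessor, and that head marking shows up as a `[psor]` feature on the possessed noun. Real treebanks may annotate possession differently. The function names and the sample sentence are hypothetical.

```python
# Illustrative sketch only: the heuristics below are assumptions,
# not the method described in the paper.

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Gen|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def read_sentence(block):
    """Parse one CoNLL-U sentence block into a dict of tokens keyed by ID."""
    tokens = {}
    for line in block.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        tokens[int(cols[0])] = {
            "form": cols[1],
            "feats": parse_feats(cols[5]),
            "head": int(cols[6]) if cols[6].isdigit() else 0,
            "deprel": cols[7],
        }
    return tokens


def classify_locus(tokens):
    """Return (possessor, possessed, locus) for each assumed possessor token."""
    results = []
    for tok in tokens.values():
        if tok["deprel"] != "nmod:poss":  # assumption: possessors use this label
            continue
        head = tokens.get(tok["head"])
        if head is None:
            continue
        dep_marked = tok["feats"].get("Case") == "Gen"
        head_marked = any(key.endswith("[psor]") for key in head["feats"])
        if dep_marked and head_marked:
            locus = "double-marking"
        elif dep_marked:
            locus = "dependent-marking"
        elif head_marked:
            locus = "head-marking"
        else:
            locus = "zero-marking"
        results.append((tok["form"], head["form"], locus))
    return results


# Hypothetical two-token sample ("Pekka's house"), built with explicit tabs.
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "Pekan", "Pekka", "PROPN", "_", "Case=Gen|Number=Sing", "2", "nmod:poss", "_", "_"],
    ["2", "talo", "talo", "NOUN", "_", "Case=Nom|Number=Sing", "0", "root", "_", "_"],
])

print(classify_locus(read_sentence(SAMPLE)))
```

A sketch like this only finds zero-marked constructions when the treebank labels the possessor relation explicitly. Where possession is annotated with a general relation and no morphological feature, nothing distinguishes it from other nominal modification, which is in line with the difficulty the abstract reports for zero-marked constructions.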
Anthology ID: 2020.udw-1.18
Volume: Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020)
Month: December
Year: 2020
Address: Barcelona, Spain (Online)
Venues: COLING | UDW
Publisher: Association for Computational Linguistics
Pages: 158–167
URL: https://www.aclweb.org/anthology/2020.udw-1.18
PDF: http://aclanthology.lst.uni-saarland.de/2020.udw-1.18.pdf