Catalina Maranduc


Adding a Syntactic Annotation Level to the Corpus of Contemporary Romanian Language
Andrei Scutelnicu | Catalina Maranduc | Dan Cristea
Proceedings of the 8th Workshop on Challenges in the Management of Large Corpora

In this paper we present an experiment in augmenting the Corpus of Contemporary Romanian Language (CoRoLa) with a syntactic level of annotation, which would allow users to query the syntax of Romanian sentences in the Universal Dependencies model. After a short introduction to CoRoLa, we describe the treebanks used to train the dependency parser, present the evaluation results, and outline the process of upgrading CoRoLa with the new annotation level. Of three parser variants trained on manually built treebanks, we chose the one with the best accuracy in recognizing heads and relations.

Keywords: syntactic annotation, treebank, corpus, MaltParser