A Preliminary Study of Croatian Lexical Substitution

Domagoj Alagić, Jan Šnajder


Abstract
Lexical substitution is the task of determining a meaning-preserving replacement for a word in context. We report on a preliminary study of this task for the Croatian language on a small-scale lexical sample dataset, manually annotated using three different annotation schemes. We compare the annotations, analyze the inter-annotator agreement, and observe a number of interesting language-specific details in the obtained lexical substitutes. Furthermore, we apply a recently proposed dependency-based lexical substitution model to our dataset. The model achieves a P@3 score of 0.35, which indicates the difficulty of the task.
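For readers unfamiliar with the P@3 figure, the following is a minimal sketch of one common way precision at k is computed: the fraction of a model's top-k ranked substitutes that also appear among the annotators' gold substitutes, averaged over instances. The abstract does not spell out the exact variant used in the paper, so this definition is an assumption, and the function names and the Croatian example words are invented for illustration rather than taken from the dataset.

```python
def precision_at_k(ranked, gold, k=3):
    """Fraction of the top-k ranked candidates that appear in the gold set.

    Assumed definition: hits among the first k candidates, divided by k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    hits = sum(1 for candidate in top_k if candidate in gold)
    return hits / k


def mean_precision_at_k(instances, k=3):
    """Average precision at k over (ranked, gold) pairs; 0.0 if there are none."""
    scores = [precision_at_k(ranked, gold, k) for ranked, gold in instances]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical instances (not from the paper's dataset): each pair holds a
# model's ranked substitutes and the set of annotator-provided substitutes.
instances = [
    (["hitar", "spor", "žustar"], {"hitar", "žustar", "ubrzan"}),  # 2 of 3 hit
    (["kuća", "dom", "zgrada"], {"dom"}),                          # 1 of 3 hit
]

print(round(mean_precision_at_k(instances, k=3), 2))
```

Under this definition, a score of 0.35 would mean that roughly one of the three top-ranked substitutes per instance is attested by the annotators.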
Anthology ID: W17-1403
Volume: Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
Month: April
Year: 2017
Address: Valencia, Spain
Venues: BSNLP | WS
SIG: SIGSLAV
Publisher: Association for Computational Linguistics
Pages: 14–19
URL: https://www.aclweb.org/anthology/W17-1403
DOI: 10.18653/v1/W17-1403
PDF: http://aclanthology.lst.uni-saarland.de/W17-1403.pdf