PAWS: A Multi-lingual Parallel Treebank with Anaphoric Relations

Anna Nedoluzhko, Michal Novák, Maciej Ogrodniczuk


Abstract
We present PAWS, a multi-lingual parallel treebank with coreference annotation. It consists of English texts from the Wall Street Journal translated into Czech, Russian and Polish. In addition, the texts are syntactically parsed and word-aligned. PAWS is based on PCEDT 2.0 and continues the tradition of multilingual treebanks with coreference annotation. The paper focuses on the coreference annotation in PAWS and its language-specific differences. PAWS offers linguistic material that can be further leveraged in cross-lingual studies, especially on coreference.
Anthology ID: W18-0708
Volume: Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference
Month: June
Year: 2018
Address: New Orleans, Louisiana
Venues: CRAC | NAACL | WS
Publisher: Association for Computational Linguistics
Pages: 68–76
URL: https://www.aclweb.org/anthology/W18-0708
DOI: 10.18653/v1/W18-0708
PDF: http://aclanthology.lst.uni-saarland.de/W18-0708.pdf