Multilingual Projection for Parsing Truly Low-Resource Languages

Željko Agić, Anders Johannsen, Barbara Plank, Héctor Martínez Alonso, Natalie Schluter, Anders Søgaard


Abstract
We propose a novel approach to cross-lingual part-of-speech tagging and dependency parsing for truly low-resource languages. Our annotation projection-based approach yields tagging and parsing models for over 100 languages. All that is needed are freely available parallel texts, and taggers and parsers for resource-rich languages. The empirical evaluation across 30 test languages shows that our method consistently provides top-level accuracies, close to established upper bounds, and outperforms several competitive baselines.
Anthology ID: Q16-1022
Volume: Transactions of the Association for Computational Linguistics, Volume 4
Year: 2016
Venue: TACL
Pages: 301–312
URL: https://www.aclweb.org/anthology/Q16-1022
DOI: 10.1162/tacl_a_00100
PDF: http://aclanthology.lst.uni-saarland.de/Q16-1022.pdf