A New Integrated Open-source Morphological Analyzer for Hungarian

Attila Novák, Borbála Siklósi, Csaba Oravecz


Abstract
The goal of a Hungarian research project has been to create an integrated Hungarian natural language processing framework. This infrastructure includes tools for analyzing Hungarian texts, integrated into a standardized environment. The morphological analyzer is one of the core components of the framework. The goal of this paper is to describe a fast and customizable morphological analyzer and its development framework, which synthesizes and further enriches the morphological knowledge implemented in previous tools existing for Hungarian. In addition, we present the method we applied to add semantic knowledge to the lexical database of the morphology. The method utilizes neural word embedding models and morphological and shallow syntactic knowledge.
Anthology ID:
L16-1209
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
1315–1322
Language:
URL:
https://www.aclweb.org/anthology/L16-1209
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/L16-1209.pdf