Evaluation and Adaptation of the Celex Dutch Morphological Database

Tom Laureys, Guy De Pauw, Hugo Van hamme, Walter Daelemans, Dirk Van Compernolle


Abstract
This paper describes some important modifications to the Celex morphological database in the context of the FLaVoR project. FLaVoR aims to develop a novel modular framework for speech recognition, enabling the integration of complex linguistic knowledge sources, such as a morphological model. Morphology is a fairly unexploited linguistic information source speech recognizers could benefit from. This is especially true for languages which allow for a rich set of morphological operations, such as our target language Dutch. In this paper we focus on the exploitation of the Celex Dutch morphological database as the information source underlying two different morphological analyzers being developed within the project. Although the Celex database provides a valuable source of morphological information for Dutch, many modifications were necessary before it could be practically applied. We identify major problems, discuss the implemented solutions and finally experimentally evaluate the effect of our modifications to the database.
Anthology ID:
L04-1247
Volume:
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Month:
May
Year:
2004
Address:
Lisbon, Portugal
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2004/pdf/421.pdf
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://www.lrec-conf.org/proceedings/lrec2004/pdf/421.pdf