A Crash Test with Linguistica in Modern Greek: The Case of Derivational Affixes and Bound Stems
Athanasios Karasimos | Evanthia Petropoulou
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper contributes to the ongoing discussion on a suitable model for the computational treatment of Greek morphology. Focusing on unsupervised morphology learning, and particularly on the Linguistica model of Goldsmith (2001), we attempt a computational treatment of specific word formation phenomena in Modern Greek (MG), such as suffixation and compounding with bound stems, using various corpora. The system cannot accept any morphological rule as input, hence the term 'unsupervised', and this greatly limits its parsing efficiency, especially in morphologically rich languages such as MG. Specifically, neither the rich allomorphy nor the complex combinability of morphemes in MG appears to be treated efficiently by this technique, resulting in low scores of correct word segmentation (22% for inflectional suffixes and 13% for derivational ones), as well as in the recognition of false morphemes.