Automatically Tailoring Unsupervised Morphological Segmentation to the Language

Ramy Eskander, Owen Rambow, Smaranda Muresan


Abstract
Morphological segmentation is beneficial for several natural language processing tasks dealing with large vocabularies. Unsupervised methods for morphological segmentation are essential for handling a diverse set of languages, including low-resource languages. Eskander et al. (2016) introduced a Language Independent Morphological Segmenter (LIMS) using Adaptor Grammars (AG) based on the best-on-average performing AG configuration. However, while LIMS worked best on average and outperforms other state-of-the-art unsupervised morphological segmentation approaches, it did not provide the optimal AG configuration for five out of the six languages. We propose two language-independent classifiers that enable the selection of the optimal or nearly-optimal configuration for the morphological segmentation of unseen languages.
Anthology ID:
W18-5808
Volume:
Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
October
Year:
2018
Address:
Brussels, Belgium
Venues:
EMNLP | WS
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Note:
Pages:
78–83
Language:
URL:
https://www.aclweb.org/anthology/W18-5808
DOI:
10.18653/v1/W18-5808
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/W18-5808.pdf