On the Complexity and Typology of Inflectional Morphological Systems

Ryan Cotterell, Christo Kirov, Mans Hulden, Jason Eisner


Abstract
We quantify the linguistic complexity of different languages’ morphological systems. We verify that there is a statistically significant empirical trade-off between paradigm size and irregularity: A language’s inflectional paradigms may be either large in size or highly irregular, but never both. We define a new measure of paradigm irregularity based on the conditional entropy of the surface realization of a paradigm— how hard it is to jointly predict all the word forms in a paradigm from the lemma. We estimate irregularity by training a predictive model. Our measurements are taken on large morphological paradigms from 36 typologically diverse languages.
Anthology ID:
Q19-1021
Volume:
Transactions of the Association for Computational Linguistics, Volume 7
Month:
March
Year:
2019
Address:
Venue:
TACL
SIG:
Publisher:
Note:
Pages:
327–342
Language:
URL:
https://www.aclweb.org/anthology/Q19-1021
DOI:
10.1162/tacl_a_00271
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/Q19-1021.pdf