Muhamed Al Khalil


2020

pdf bib
A Large-Scale Leveled Readability Lexicon for Standard Arabic
Muhamed Al Khalil | Nizar Habash | Zhengyang Jiang
Proceedings of the 12th Language Resources and Evaluation Conference

We present a large-scale 26,000-lemma leveled readability lexicon for Modern Standard Arabic. The lexicon was manually annotated in triplicate by language professionals from three regions in the Arab world. The annotations show a high degree of agreement; and major differences were limited to regional variations. Comparing lemma readability levels with their frequencies provided good insights in the benefits and pitfalls of frequency-based readability approaches. The lexicon will be publicly available.

pdf bib
An Online Readability Leveled Arabic Thesaurus
Zhengyang Jiang | Nizar Habash | Muhamed Al Khalil
Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations

This demo paper introduces the online Readability Leveled Arabic Thesaurus interface. For a given user input word, this interface provides the word’s possible lemmas, roots, English glosses, related Arabic words and phrases, and readability on a five-level readability scale. This interface builds on and connects multiple existing Arabic resources and processing tools. This one-of-a-kind system enables Arabic speakers and learners to benefit from advances in Arabic computational linguistics technologies. Feedback from users of the system will help the developers to identify lexical coverage gaps and errors. A live link to the demo is available at: http://samer.camel-lab.com/.

2018

pdf bib
A Leveled Reading Corpus of Modern Standard Arabic
Muhamed Al Khalil | Hind Saddiki | Nizar Habash | Latifa Alfalasi
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Feature Optimization for Predicting Readability of Arabic L1 and L2
Hind Saddiki | Nizar Habash | Violetta Cavalli-Sforza | Muhamed Al Khalil
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

Advances in automatic readability assessment can impact the way people consume information in a number of domains. Arabic, being a low-resource and morphologically complex language, presents numerous challenges to the task of automatic readability assessment. In this paper, we present the largest and most in-depth computational readability study for Arabic to date. We study a large set of features with varying depths, from shallow words to syntactic trees, for both L1 and L2 readability tasks. Our best L1 readability accuracy result is 94.8% (75% error reduction from a commonly used baseline). The comparable results for L2 are 72.4% (45% error reduction). We also demonstrate the added value of leveraging L1 features for L2 readability prediction.