The SMarT Classifier for Arabic Fine-Grained Dialect Identification

Karima Meftouh, Karima Abidi, Salima Harrat, Kamel Smaili


Abstract
This paper describes the approach adopted by the SMarT research group to build a dialect identification system for the MADAR shared task on Arabic fine-grained dialect identification. We experimented with several approaches and ultimately chose a Multinomial Naive Bayes classifier based on word and character n-grams, in addition to language model probabilities. The system achieved a macro accuracy of 67.73% and a macro-averaged F1-score of 67.31%.
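
To make the classifier named in the abstract concrete, the following is a minimal sketch of a Multinomial Naive Bayes model over word and character n-grams. It is not the authors' implementation: the n-gram ranges, the Laplace smoothing constant `alpha`, the `w:`/`c:` feature prefixes, the function and class names, and the romanized toy sentences with their `cairo`/`beirut` labels are all assumptions made for illustration. The language model probabilities mentioned in the abstract are left out.

```python
import math
from collections import Counter, defaultdict


def extract_features(sentence, word_n=(1, 2), char_n=(1, 3)):
    """Return word and character n-grams for one sentence (ranges are assumed)."""
    feats = []
    words = sentence.split()
    for n in range(word_n[0], word_n[1] + 1):
        for i in range(len(words) - n + 1):
            feats.append("w:" + " ".join(words[i:i + n]))
    for n in range(char_n[0], char_n[1] + 1):
        for i in range(len(sentence) - n + 1):
            feats.append("c:" + sentence[i:i + n])
    return feats


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)      # sentences per class
        self.feat_counts = defaultdict(Counter)  # feature counts per class
        self.vocab = set()
        for sent, label in zip(sentences, labels):
            feats = extract_features(sent)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        self.totals = {c: sum(self.feat_counts[c].values())
                       for c in self.class_counts}
        return self

    def log_scores(self, sentence):
        """Return log P(class) + sum of log P(feature | class) per class."""
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / n_docs)
            denom = self.totals[c] + self.alpha * vocab_size
            for f in extract_features(sentence):
                if f in self.vocab:  # features unseen in training are ignored
                    score += math.log((self.feat_counts[c][f] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, sentence):
        scores = self.log_scores(sentence)
        return max(scores, key=scores.get)


# Toy romanized sentences with hypothetical labels; NOT taken from the MADAR corpus.
train_sentences = [
    "izzayyak ya basha",
    "inta fein dilwaqti",
    "kifak ya zalameh",
    "shu badak hallaq",
]
train_labels = ["cairo", "cairo", "beirut", "beirut"]

model = MultinomialNB(alpha=1.0).fit(train_sentences, train_labels)
print(model.predict("izzayyak inta"))
print(model.predict("kifak badak"))
```

In a real system, each sentence would be scored against one class per dialect label in the shared task, and additional scores (such as per-dialect language model probabilities) could be appended as further features.
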
Anthology ID: W19-4633
Volume: Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month: August
Year: 2019
Address: Florence, Italy
Venues: ACL | WANLP | WS
Publisher: Association for Computational Linguistics
Pages: 259–263
URL: https://www.aclweb.org/anthology/W19-4633
DOI: 10.18653/v1/W19-4633
PDF: http://aclanthology.lst.uni-saarland.de/W19-4633.pdf