A Character Level Convolutional BiLSTM for Arabic Dialect Identification

Mohamed Elaraby, Ahmed Zahran


Abstract
In this paper, we describe the CU-RAISA team's contribution to the 2019 Madar shared task 2, which focused on fine-grained dialect identification of Twitter users. Among the participating teams, our system ranked 4th, with an F1-Macro score of 61.54%. Our system is a character-level convolutional bidirectional long short-term memory network trained on data from 2k users. We show that training on a user's concatenated tweets as a single input is superior to training on the tweets separately and assigning the user's label as the mode of the per-tweet predictions.
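The two user-labelling strategies compared in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the `classify` argument is a hypothetical stand-in for the trained network, the function names and the space separator are assumptions made for illustration, and the sketch covers prediction only, not training.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical type: any function that maps a text to a dialect label.
# In the paper this role is played by the trained network.
Classifier = Callable[[str], str]


def label_by_mode(tweets: List[str], classify: Classifier) -> str:
    """Classify each tweet separately and return the most frequent label."""
    votes = Counter(classify(tweet) for tweet in tweets)
    # most_common(1) returns [(label, count)]; on a tie, the label
    # encountered first wins.
    return votes.most_common(1)[0][0]


def label_by_concatenation(
    tweets: List[str], classify: Classifier, sep: str = " "
) -> str:
    """Join all of a user's tweets into one input and classify it once."""
    return classify(sep.join(tweets))


if __name__ == "__main__":
    # Toy keyword classifier, an assumption used only to exercise the code.
    def toy_classify(text: str) -> str:
        return "Cairo" if "ezayak" in text else "Other"

    user_tweets = ["ezayak ya basha", "good morning", "ezayak"]
    print(label_by_mode(user_tweets, toy_classify))
    print(label_by_concatenation(user_tweets, toy_classify))
```

With the mode strategy each tweet is scored in isolation, so short or uninformative tweets each cast a full vote. With concatenation the classifier sees all of a user's text at once, which is the setting the abstract reports as superior.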
Anthology ID:
W19-4636
Volume:
Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month:
August
Year:
2019
Address:
Florence, Italy
Venues:
ACL | WANLP | WS
Publisher:
Association for Computational Linguistics
Pages:
274–278
URL:
https://www.aclweb.org/anthology/W19-4636
DOI:
10.18653/v1/W19-4636
PDF:
http://aclanthology.lst.uni-saarland.de/W19-4636.pdf