Go Inoue


2020

CAMeL Tools: An Open Source Python Toolkit for Arabic Natural Language Processing
Ossama Obeid | Nasser Zalmout | Salam Khalifa | Dima Taji | Mai Oudah | Bashar Alhafni | Go Inoue | Fadhl Eryani | Alexander Erdmann | Nizar Habash
Proceedings of the 12th Language Resources and Evaluation Conference

We present CAMeL Tools, a collection of open-source tools for Arabic natural language processing in Python. CAMeL Tools currently provides utilities for pre-processing, morphological modeling, dialect identification, named entity recognition, and sentiment analysis. In this paper, we describe the design of CAMeL Tools and the functionalities it provides.

2018

A Parallel Corpus of Arabic-Japanese News Articles
Go Inoue | Nizar Habash | Yuji Matsumoto | Hiroyuki Aoyama
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Joint Prediction of Morphosyntactic Categories for Fine-Grained Arabic Part-of-Speech Tagging Exploiting Tag Dictionary Information
Go Inoue | Hiroyuki Shindo | Yuji Matsumoto
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Part-of-speech (POS) tagging for morphologically rich languages such as Arabic is a challenging problem because of their enormous tag sets. One reason for this is that in the tagging scheme for such languages, a complete POS tag is formed by combining tags from multiple tag sets defined for each morphosyntactic category. Previous approaches in Arabic POS tagging applied one model for each morphosyntactic tagging task, without utilizing shared information between the tasks. In this paper, we propose an approach that utilizes this information by jointly modeling multiple morphosyntactic tagging tasks with a multi-task learning framework. We also propose a method of incorporating tag dictionary information into our neural models by combining word representations with representations of the sets of possible tags. Our experiments showed that the joint model with tag dictionary information achieves an accuracy of 91.38% on the Penn Arabic Treebank data set, an absolute improvement of 2.11 percentage points over the current state-of-the-art tagger.