Nadi Tomeh


2020

pdf bib
Proceedings of the Fifth Arabic Natural Language Processing Workshop
Imed Zitouni | Muhammad Abdul-Mageed | Houda Bouamor | Fethi Bougares | Mahmoud El-Haj | Nadi Tomeh | Wajdi Zaghouani
Proceedings of the Fifth Arabic Natural Language Processing Workshop

pdf bib
Multitask Easy-First Dependency Parsing: Exploiting Complementarities of Different Dependency Representations
Yash Kankanampati | Joseph Le Roux | Nadi Tomeh | Dima Taji | Nizar Habash
Proceedings of the 28th International Conference on Computational Linguistics

In this paper we present a parsing model for projective dependency trees which takes advantage of the existence of complementary dependency annotations which is the case in Arabic, with the availability of CATiB and UD treebanks. Our system performs syntactic parsing according to both annotation types jointly as a sequence of arc-creating operations, and partially created trees for one annotation are also available to the other as features for the score function. This method gives error reduction of 9.9% on CATiB and 6.1% on UD compared to a strong baseline, and ablation tests show that the main contribution of this reduction is given by sharing tree representation between tasks, and not simply sharing BiLSTM layers as is often performed in NLP multitask systems.

2019

pdf bib
Proceedings of the Fourth Arabic Natural Language Processing Workshop
Wassim El-Hajj | Lamia Hadrich Belguith | Fethi Bougares | Walid Magdy | Imed Zitouni | Nadi Tomeh | Mahmoud El-Haj
Proceedings of the Fourth Arabic Natural Language Processing Workshop

2017

pdf bib
Proceedings of the Third Arabic Natural Language Processing Workshop
Nizar Habash | Mona Diab | Kareem Darwish | Wassim El-Hajj | Hend Al-Khalifa | Houda Bouamor | Nadi Tomeh | Mahmoud El-Haj
Proceedings of the Third Arabic Natural Language Processing Workshop

2016

pdf bib
Deep Lexical Segmentation and Syntactic Parsing in the Easy-First Dependency Framework
Matthieu Constant | Joseph Le Roux | Nadi Tomeh
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Fouille de motifs et CRF pour la reconnaissance de symptômes dans les textes biomédicaux (Pattern mining and CRF for symptoms recognition in biomedical texts)
Pierre Holat | Nadi Tomeh | Thierry Charnois | Delphine Battistelli | Marie-Christine Jaulent | Jean-Philippe Métivier
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Articles longs)

Dans cet article, nous nous intéressons à l’extraction d’entités médicales de type symptôme dans les textes biomédicaux. Cette tâche est peu explorée dans la littérature et il n’existe pas à notre connaissance de corpus annoté pour entraîner un modèle d’apprentissage. Nous proposons deux approches faiblement supervisées pour extraire ces entités. Une première est fondée sur la fouille de motifs et introduit une nouvelle contrainte de similarité sémantique. La seconde formule la tache comme une tache d’étiquetage de séquences en utilisant les CRF (champs conditionnels aléatoires). Nous décrivons les expérimentations menées qui montrent que les deux approches sont complémentaires en termes d’évaluation quantitative (rappel et précision). Nous montrons en outre que leur combinaison améliore sensiblement les résultats.

2015

pdf bib
Classification de texte enrichie à l’aide de motifs séquentiels
Pierre Holat | Nadi Tomeh | Thierry Charnois
Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

En classification de textes, la plupart des méthodes fondées sur des classifieurs statistiques utilisent des mots, ou des combinaisons de mots contigus, comme descripteurs. Si l’on veut prendre en compte plus d’informations le nombre de descripteurs non contigus augmente exponentiellement. Pour pallier à cette croissance, la fouille de motifs séquentiels permet d’extraire, de façon efficace, un nombre réduit de descripteurs qui sont à la fois fréquents et pertinents grâce à l’utilisation de contraintes. Dans ce papier, nous comparons l’utilisation de motifs fréquents sous contraintes et l’utilisation de motifs -libres, comme descripteurs. Nous montrons les avantages et inconvénients de chaque type de motif.

2014

pdf bib
LIPN: Introducing a new Geographical Context Similarity Measure and a Statistical Similarity Measure based on the Bhattacharyya coefficient
Davide Buscaldi | Jorge García Flores | Joseph Le Roux | Nadi Tomeh | Belém Priego Sanchez
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

pdf bib
Generalized Character-Level Spelling Error Correction
Noura Farra | Nadi Tomeh | Alla Rozovskaya | Nizar Habash
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
A Pipeline Approach to Supervised Error Correction for the QALB-2014 Shared Task
Nadi Tomeh | Nizar Habash | Ramy Eskander | Joseph Le Roux
Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP)

pdf bib
Ontology-based Technical Text Annotation
François Lévy | Nadi Tomeh | Yue Ma
Proceedings of the COLING Workshop on Synchronic and Diachronic Approaches to Analyzing Technical Language

pdf bib
Large Scale Arabic Error Annotation: Guidelines and Framework
Wajdi Zaghouani | Behrang Mohit | Nizar Habash | Ossama Obeid | Nadi Tomeh | Alla Rozovskaya | Noura Farra | Sarah Alkuhlani | Kemal Oflazer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present annotation guidelines and a web-based annotation framework developed as part of an effort to create a manually annotated Arabic corpus of errors and corrections for various text types. Such a corpus will be invaluable for developing Arabic error correction tools, both for training models and as a gold standard for evaluating error correction algorithms. We summarize the guidelines we created. We also describe issues encountered during the training of the annotators, as well as problems that are specific to the Arabic language that arose during the annotation process. Finally, we present the annotation tool that was developed as part of this project, the annotation pipeline, and the quality of the resulting annotations.

2013

pdf bib
A Web-based Annotation Framework For Large-Scale Text Correction
Ossama Obeid | Wajdi Zaghouani | Behrang Mohit | Nizar Habash | Kemal Oflazer | Nadi Tomeh
The Companion Volume of the Proceedings of IJCNLP 2013: System Demonstrations

pdf bib
Reranking with Linguistic and Semantic Features for Arabic Optical Character Recognition
Nadi Tomeh | Nizar Habash | Ryan Roth | Noura Farra | Pradeep Dasigi | Mona Diab
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Morphological Analysis and Disambiguation for Dialectal Arabic
Nizar Habash | Ryan Roth | Owen Rambow | Ramy Eskander | Nadi Tomeh
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Processing Spontaneous Orthography
Ramy Eskander | Nizar Habash | Owen Rambow | Nadi Tomeh
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

pdf bib
HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce
Andrea Gesmundo | Nadi Tomeh
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Improving Relative-Entropy Pruning using Statistical Significance
Wang Ling | Nadi Tomeh | Guang Xiang | Isabel Trancoso | Alan Black
Proceedings of COLING 2012: Posters

2011

pdf bib
Discriminative Weighted Alignment Matrices For Statistical Machine Translation
Nadi Tomeh | Alexandre Allauzen | François Yvon
Proceedings of the 15th Annual conference of the European Association for Machine Translation