Bruno Guillaume


Rigor Mortis: Annotating MWEs with a Gamified Platform
Karën Fort | Bruno Guillaume | Yann-Alan Pilatte | Mathieu Constant | Nicolas Lefèbvre
Proceedings of the 12th Language Resources and Evaluation Conference

We present here Rigor Mortis, a gamified crowdsourcing platform designed to evaluate the intuition of the speakers, then train them to annotate multi-word expressions (MWEs) in French corpora. We previously showed that the speakers’ intuition is reasonably good (65% recall on non-fixed MWEs). We detail here the annotation results, after a training phase using some of the tests developed in the PARSEME-FR project.

When Collaborative Treebank Curation Meets Graph Grammars
Gaël Guibon | Marine Courtin | Kim Gerdes | Bruno Guillaume
Proceedings of the 12th Language Resources and Evaluation Conference

In this paper we present Arborator-Grew, a collaborative annotation tool for treebank development. Arborator-Grew combines the features of two preexisting tools: Arborator and Grew. Arborator is a widely used collaborative graphical online dependency treebank annotation tool. Grew is a tool for graph querying and rewriting specialized in structures needed in NLP, i.e. syntactic and semantic dependency trees and graphs. Grew also has an online version, Grew-match, where all Universal Dependencies treebanks in their classical, deep and surface-syntactic flavors can be queried. Arborator-Grew is a complete redevelopment and modernization of Arborator, replacing its own internal database storage by a new Grew API, which adds a powerful query tool to Arborator’s existing treebank creation and correction features. This includes complex access control for parallel expert and crowd-sourced annotation, tree comparison visualization, and various exercise modes for teaching and training of annotators. Arborator-Grew opens up new paths of collectively creating, updating, maintaining, and curating syntactic treebanks and semantic graph banks.

A French Version of the FraCaS Test Suite
Maxime Amblard | Clément Beysson | Philippe de Groote | Bruno Guillaume | Sylvain Pogodalla
Proceedings of the 12th Language Resources and Evaluation Conference

This paper presents a French version of the FraCaS test suite. This test suite, originally written in English, contains problems illustrating semantic inference in natural language. We describe linguistic choices we had to make when translating the FraCaS test suite into French, and discuss some of the issues that were raised by the translation. We also report an experiment we ran in order to test both the translation and the logical semantics underlying the problems of the test suite. This provides a way of checking formal semanticists’ hypotheses against the actual semantic capacity of speakers (in the present case, French speakers), and allows us to compare the results we obtained with those of similar experiments that have been conducted for other languages.

Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions
Carlos Ramisch | Agata Savary | Bruno Guillaume | Jakub Waszczuk | Marie Candito | Ashwini Vaidya | Verginica Barbu Mititelu | Archna Bhatia | Uxoa Iñurrieta | Voula Giouli | Tunga Gungor | Menghan Jiang | Timm Lichte | Chaya Liebeskind | Johanna Monti | Renata Ramisch | Sara Stymne | Abigail Walsh | Hongzhi Xu
Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons

We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs). Lessons learned from previous editions indicate that VMWEs have low ambiguity, and that the major challenge lies in identifying test instances never seen in the training data. Therefore, this edition focuses on unseen VMWEs. We have split annotated corpora so that the test corpora contain around 300 unseen VMWEs, and we provide non-annotated raw corpora to be used by complementary discovery methods. We released annotated and raw corpora in 14 languages, and this semi-supervised challenge attracted 7 teams who submitted 9 system results. This paper describes the effort of corpus creation, the task design, and the results obtained by the participating systems, especially their performance on unseen expressions.


Improving Surface-syntactic Universal Dependencies (SUD): MWEs and deep syntactic features
Kim Gerdes | Bruno Guillaume | Sylvain Kahane | Guy Perrier
Proceedings of the 18th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2019)


“Fingers in the Nose”: Evaluating Speakers’ Identification of Multi-Word Expressions Using a Slightly Gamified Crowdsourcing Platform
Karën Fort | Bruno Guillaume | Matthieu Constant | Nicolas Lefèbvre | Yann-Alan Pilatte
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

This article presents the results we obtained in crowdsourcing French speakers’ intuition concerning multi-word expressions (MWEs). We developed a slightly gamified crowdsourcing platform, part of which is designed to test users’ ability to identify MWEs with no prior training. The participants perform relatively well at the task, with a recall reaching 65% for MWEs that do not behave as function words.

SUD or Surface-Syntactic Universal Dependencies: An annotation scheme near-isomorphic to UD
Kim Gerdes | Bruno Guillaume | Sylvain Kahane | Guy Perrier
Proceedings of the Second Workshop on Universal Dependencies (UDW 2018)

This article proposes a surface-syntactic annotation scheme called SUD that is near-isomorphic to the Universal Dependencies (UD) annotation scheme while following distributional criteria for defining the dependency tree structure and the naming of the syntactic functions. Rule-based graph transformation grammars allow for a bi-directional transformation of UD into SUD. The back-and-forth transformation can serve as an error-mining tool to assure the intra-language and inter-language coherence of the UD treebanks.


Vers l’annotation par le jeu de corpus (plus) complexes : le cas de la langue de spécialité (Towards (more) complex corpora annotation using a game with a purpose: the case of scientific language)
Karën Fort | Bruno Guillaume | Nicolas Lefebvre | Laura Ramírez | Mathilde Regnault | Mary Collins | Oksana Gavrilova | Tanti Kristanti
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Volume 2 - Articles courts

We previously showed that it is possible to have participants in a game with a purpose produce quality syntactic annotations. We present here the results of an experiment aimed at evaluating their production on a more complex corpus in specialized language, in this case a corpus of scientific texts on DNA. We precisely determine the complexity of this corpus, then evaluate the dependency syntax annotations produced by the players against a reference established by domain experts.

Enhanced UD Dependencies with Neutralized Diathesis Alternation
Marie Candito | Bruno Guillaume | Guy Perrier | Djamé Seddah
Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017)


Crowdsourcing Complex Language Resources: Playing to Annotate Dependency Syntax
Bruno Guillaume | Karën Fort | Nicolas Lefebvre
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This article presents the results we obtained on a complex annotation task (that of dependency syntax) using a specifically designed Game with a Purpose, ZombiLingo. We show that with suitable mechanisms (decomposition of the task, training of the players and regular control of the annotation quality during the game), it is possible to obtain annotations whose quality is significantly higher than that obtainable with a parser, provided that enough players participate. The source code of the game and the resulting annotated corpora (for French) are freely available.


Recherche de motifs de graphe en ligne (Online graph pattern search)
Bruno Guillaume
Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations

We present an online tool for searching for graphs in syntactically annotated corpora.

Dependency Parsing with Graph Rewriting
Bruno Guillaume | Guy Perrier
Proceedings of the 14th International Conference on Parsing Technologies


Annotation scheme for deep dependency syntax of French (Un schéma d’annotation en dépendances syntaxiques profondes pour le français) [in French]
Guy Perrier | Marie Candito | Bruno Guillaume | Corentin Ribeyre | Karën Fort | Djamé Seddah
Proceedings of TALN 2014 (Volume 2: Short Papers)

ZOMBILINGO: eating heads to perform dependency syntax annotation (ZOMBILINGO : manger des têtes pour annoter en syntaxe de dépendances) [in French]
Karën Fort | Bruno Guillaume | Valentin Stern
Proceedings of TALN 2014 (Volume 3: System Demonstrations)

Deep Syntax Annotation of the Sequoia French Treebank
Marie Candito | Guy Perrier | Bruno Guillaume | Corentin Ribeyre | Karën Fort | Djamé Seddah | Éric de la Clergerie
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We define a deep syntactic representation scheme for French, which abstracts away from surface syntactic variation and diathesis alternations, and describe the annotation of deep syntactic representations on top of the surface dependency trees of the Sequoia corpus. The resulting deep-annotated corpus, named deep-sequoia, is freely available, and hopefully useful for corpus linguistics studies and for training deep analyzers to prepare semantic analysis.

Mapping the Lexique des Verbes du Français (Lexicon of French Verbs) to a NLP lexicon using examples
Bruno Guillaume | Karën Fort | Guy Perrier | Paul Bédaride
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This article presents experiments aiming at mapping the Lexique des Verbes du Français (Lexicon of French Verbs) to FRILEX, a Natural Language Processing (NLP) lexicon based on DICOVALENCE. The two resources (Lexicon of French Verbs and DICOVALENCE) were built by linguists, based on very different theories, which makes a direct mapping nearly impossible. We chose to use the examples provided in one of the resources to find implicit links between the two and make them explicit.


Formalizing an annotation guide: some experiments towards assisted agile annotation (Expériences de formalisation d’un guide d’annotation : vers l’annotation agile assistée) [in French]
Bruno Guillaume | Karën Fort
Proceedings of TALN 2013 (Volume 2: Short Papers)


Annotation sémantique du French Treebank à l’aide de la réécriture modulaire de graphes (Semantic Annotation of the French Treebank using Modular Graph Rewriting) [in French]
Bruno Guillaume | Guy Perrier
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 2: TALN

Grew : un outil de réécriture de graphes pour le TAL (Grew: a Graph Rewriting Tool for NLP) [in French]
Bruno Guillaume | Guillaume Bonfante | Paul Masson | Mathieu Morey | Guy Perrier
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 5: Software Demonstrations


Modular Graph Rewriting to Compute Semantics
Guillaume Bonfante | Bruno Guillaume | Mathieu Morey | Guy Perrier
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)


Interaction Grammar for the Persian Language: Noun and Adjectival Phrases
Masood Ghayoomi | Bruno Guillaume
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)

Dependency Constraints for Lexical Disambiguation
Guillaume Bonfante | Bruno Guillaume | Mathieu Morey
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)


A Toolchain for Grammarians
Bruno Guillaume | Joseph Le Roux | Jonathan Marchand | Guy Perrier | Karën Fort | Jennifer Planul
Coling 2008: Companion volume: Demonstrations


PrepLex: A Lexicon of French Prepositions for Parsing
Karën Fort | Bruno Guillaume
Proceedings of the Fourth ACL-SIGSEM Workshop on Prepositions


Polarization and abstraction of grammatical formalisms as methods for lexical disambiguation
Guillaume Bonfante | Bruno Guillaume | Guy Perrier
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics