Fabrizio Gotti


2019

pdf bib
WiRe57 : A Fine-Grained Benchmark for Open Information Extraction
William Lechelle | Fabrizio Gotti | Phillippe Langlais
Proceedings of the 13th Linguistic Annotation Workshop

We build a reference for the task of Open Information Extraction, on five documents. We tentatively resolve a number of issues that arise, including coreference and granularity, and we take steps toward addressing inference, a significant problem. We seek to better pinpoint the requirements for the task. We produce our annotation guidelines specifying what is correct to extract and what is not. In turn, we use this reference to score existing Open IE systems. We address the non-trivial problem of evaluating the extractions produced by systems against the reference tuples, and share our evaluation script. Among seven compared extractors, we find the MinIE system to perform best.

2014

pdf bib
A Tool for the Automatic Insertion of Diacritics in French (Zodiac : Insertion automatique des signes diacritiques du français) [in French]
Fabrizio Gotti | Guy Lapalme
Proceedings of TALN 2014 (Volume 3: System Demonstrations)

pdf bib
Hashtag Occurrences, Layout and Translation: A Corpus-driven Analysis of Tweets Published by the Canadian Government
Fabrizio Gotti | Phillippe Langlais | Atefeh Farzindar
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present an aligned bilingual corpus of 8758 tweet pairs in French and English, derived from Canadian government agencies. Hashtags appear in a tweet’s prologue, announcing its topic, or in the tweet’s text in lieu of traditional words, or in an epilogue. Hashtags are words prefixed with a pound sign in 80% of the cases. The rest is mostly multiword hashtags, for which we describe a segmentation algorithm. A manual analysis of the bilingual alignment of 5000 hashtags shows that 5% (French) to 18% (English) of them don’t have a counterpart in their containing tweet’s translation. This analysis shows that 80% of multiword hashtags are correctly translated by humans, and that the mistranslation of the rest may be due to incomplete translation directives regarding social media. We show how these resources and their analysis can guide the design of a machine translation pipeline, and its evaluation. A baseline system implementing a tweet-specific tokenizer yields promising results. The system is improved by translating epilogues, prologues, and text separately. We attempt to feed the SMT engine with the original hashtag and some alternatives (“dehashed” version or a segmented version of multiword hashtags), but translation quality improves at the cost of hashtag recall.

2013

pdf bib
Translating Government Agencies’ Tweet Feeds: Specificities, Problems and (a few) Solutions
Fabrizio Gotti | Philippe Langlais | Atefeh Farzindar
Proceedings of the Workshop on Language Analysis in Social Media

2006

pdf bib
MOOD: A Modular Object-Oriented Decoder for Statistical Machine Translation
Alexandre Patry | Fabrizio Gotti | Philippe Langlais
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We present an Open Source framework called MOOD developed in order tofacilitate the development of a Statistical Machine Translation Decoder.MOOD has been modularized using an object-oriented approach which makes itespecially suitable for the fast development of state-of-the-art decoders. Asa proof of concept, a clone of the pharaoh decoder has been implemented andevaluated. This clone named ramses is part of the current distribution of MOOD.

pdf bib
Phrase-Based SMT with Shallow Tree-Phrases
Philippe Langlais | Fabrizio Gotti
Proceedings on the Workshop on Statistical Machine Translation

pdf bib
Mood at work: Ramses versus Pharaoh
Alexandre Patry | Fabrizio Gotti | Philippe Langlais
Proceedings on the Workshop on Statistical Machine Translation

2005

pdf bib
NUKTI: English-Inuktitut Word Alignment System Description
Philippe Langlais | Fabrizio Gotti | Guihong Cao
Proceedings of the ACL Workshop on Building and Using Parallel Texts

pdf bib
RALI: SMT Shared Task System Description
Philippe Langlais | Guihong Cao | Fabrizio Gotti
Proceedings of the ACL Workshop on Building and Using Parallel Texts