Loïc Barrault


2020

pdf bib
Évaluation de systèmes apprenant tout au long de la vie (Evaluation of lifelong learning systems )
Yevhenii Prokopalo | Sylvain Meignier | Olivier Galibert | Loïc Barrault | Anthony Larcher
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d'Études sur la Parole

Aujourd’hui les systèmes intelligents obtiennent d’excellentes performances dans de nombreux domaines lorsqu’ils sont entraînés par des experts en apprentissage automatique. Lorsque ces systèmes sont mis en production, leurs performances se dégradent au cours du temps du fait de l’évolution de leur environnement réel. Une adaptation de leur modèle par des experts en apprentissage automatique est possible mais très coûteuse alors que les sociétés utilisant ces systèmes disposent d’experts du domaine qui pourraient accompagner ces systèmes dans un apprentissage tout au long de la vie. Dans cet article nous proposons un cadre d’évaluation générique pour des systèmes apprenant tout au long de la vie (SATLV). Nous proposons d’évaluer l’apprentissage assisté par l’humain (actif ou interactif) et l’apprentissage au cours du temps.

pdf bib
Traduction automatique pour la normalisation du français du XVIIe siècle ()
Simon Gabay | Loïc Barrault
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 2 : Traitement Automatique des Langues Naturelles

pdf bib
Simultaneous Machine Translation with Visual Context
Ozan Caglayan | Julia Ive | Veneta Haralampieva | Pranava Madhyastha | Loïc Barrault | Lucia Specia
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible. The translation thus has to start with an incomplete source text, which is read progressively, creating the need for anticipation. In this paper, we seek to understand whether the addition of visual information can compensate for the missing source context. To this end, we analyse the impact of different multimodal approaches and visual features on state-of-the-art SiMT frameworks. Our results show that visual context is helpful and that visually-grounded models based on explicit object region information are much better than commonly used global features, reaching up to 3 BLEU points improvement under low latency scenarios. Our qualitative analysis illustrates cases where only the multimodal systems are able to translate correctly from English into gender-marked languages, as well as deal with differences in word order, such as adjective-noun placement between English and French.

2019

pdf bib
Étude de l’apprentissage par transfert de systèmes de traduction automatique neuronaux (Study on transfer learning in neural machine translation )
Adrien Bardet | Fethi Bougares | Loïc Barrault
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume II : Articles courts

L’apprentissage par transfert est une solution au problème de l’apprentissage de systèmes de traduction automatique neuronaux pour des paires de langues peu dotées. Dans cet article, nous proposons une analyse de cette méthode. Nous souhaitons évaluer l’impact de la quantité de données et celui de la proximité des langues impliquées pour obtenir le meilleur transfert possible. Nous prenons en compte ces deux paramètres non seulement pour une tâche de traduction “classique” mais également lorsque les corpus de données font défaut. Enfin, il s’agit de proposer une approche où volume de données et proximité des langues sont combinées afin de ne plus avoir à trancher entre ces deux éléments.

pdf bib
Findings of the 2019 Conference on Machine Translation (WMT19)
Loïc Barrault | Ondřej Bojar | Marta R. Costa-jussà | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Philipp Koehn | Shervin Malmasi | Christof Monz | Mathias Müller | Santanu Pal | Matt Post | Marcos Zampieri
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019. Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites to probe specific aspects of translation.

pdf bib
LIUM’s Contributions to the WMT2019 News Translation Task: Data and Systems for German-French Language Pairs
Fethi Bougares | Jane Wottawa | Anne Baillot | Loïc Barrault | Adrien Bardet
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper describes the neural machine translation (NMT) systems of the LIUM Laboratory developed for the French↔German news translation task of the Fourth Conference onMachine Translation (WMT 2019). The chosen language pair is included for the first time in the WMT news translation task. We de-scribe how the training and the evaluation data was created. We also present our participation in the French↔German translation directions using self-attentional Transformer networks with small and big architectures.

pdf bib
Probing the Need for Visual Context in Multimodal Machine Translation
Ozan Caglayan | Pranava Madhyastha | Lucia Specia | Loïc Barrault
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial. We posit that this is a consequence of the very simple, short and repetitive sentences used in the only available dataset for the task (Multi30K), rendering the source text sufficient as context. In the general case, however, we believe that it is possible to combine visual and textual information in order to ground translations. In this paper we probe the contribution of the visual modality to state-of-the-art MMT models by conducting a systematic analysis where we partially deprive the models from source-side textual context. Our results show that under limited textual context, models are capable of leveraging the visual input to generate better translations. This contradicts the current belief that MMT models disregard the visual modality because of either the quality of the image features or the way they are integrated into the model.

2018

pdf bib
What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties
Alexis Conneau | German Kruszewski | Guillaume Lample | Loïc Barrault | Marco Baroni
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing. “Downstream” tasks, often based on sentence classification, are commonly used to evaluate the quality of sentence representations. The complexity of the tasks makes it however difficult to infer what kind of information is present in the representations. We introduce here 10 probing tasks designed to capture simple linguistic features of sentences, and we use them to study embeddings generated by three different encoders trained in eight distinct ways, uncovering intriguing properties of both encoders and training methods.

pdf bib
Findings of the Third Shared Task on Multimodal Machine Translation
Loïc Barrault | Fethi Bougares | Lucia Specia | Chiraag Lala | Desmond Elliott | Stella Frank
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We present the results from the third shared task on multimodal machine translation. In this task a source sentence in English is supplemented by an image and participating systems are required to generate a translation for such a sentence into German, French or Czech. The image can be used in addition to (or instead of) the source sentence. This year the task was extended with a third target language (Czech) and a new test set. In addition, a variant of this task was introduced with its own test set where the source sentence is given in multiple languages: English, French and German, and participating systems are required to generate a translation in Czech. Seven teams submitted 45 different systems to the two variants of the task. Compared to last year, the performance of the multimodal submissions improved, but text-only systems remain competitive.

pdf bib
LIUM-CVC Submissions for WMT18 Multimodal Translation Task
Ozan Caglayan | Adrien Bardet | Fethi Bougares | Loïc Barrault | Kai Wang | Marc Masana | Luis Herranz | Joost van de Weijer
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previous multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final constrained submissions ranked first for English→French and second for English→German language pairs among the constrained submissions according to the automatic evaluation metric METEOR.

2017

pdf bib
Very Deep Convolutional Networks for Text Classification
Alexis Conneau | Holger Schwenk | Loïc Barrault | Yann Lecun
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

The dominant approach for many NLP tasks are recurrent neural networks, in particular LSTMs, and convolutional neural networks. However, these architectures are rather shallow in comparison to the deep convolutional networks which have pushed the state-of-the-art in computer vision. We present a new architecture (VDCNN) for text processing which operates directly at the character level and uses only small convolutions and pooling operations. We are able to show that the performance of this model increases with the depth: using up to 29 convolutional layers, we report improvements over the state-of-the-art on several public text classification tasks. To the best of our knowledge, this is the first time that very deep convolutional nets have been applied to text processing.

pdf bib
Word Representations in Factored Neural Machine Translation
Franck Burlot | Mercedes García-Martínez | Loïc Barrault | Fethi Bougares | François Yvon
Proceedings of the Second Conference on Machine Translation

pdf bib
Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description
Desmond Elliott | Stella Frank | Loïc Barrault | Fethi Bougares | Lucia Specia
Proceedings of the Second Conference on Machine Translation

pdf bib
LIUM Machine Translation Systems for WMT17 News Translation Task
Mercedes García-Martínez | Ozan Caglayan | Walid Aransa | Adrien Bardet | Fethi Bougares | Loïc Barrault
Proceedings of the Second Conference on Machine Translation

pdf bib
LIUM-CVC Submissions for WMT17 Multimodal Translation Task
Ozan Caglayan | Walid Aransa | Adrien Bardet | Mercedes García-Martínez | Fethi Bougares | Loïc Barrault | Marc Masana | Luis Herranz | Joost van de Weijer
Proceedings of the Second Conference on Machine Translation

pdf bib
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
Alexis Conneau | Douwe Kiela | Holger Schwenk | Loïc Barrault | Antoine Bordes
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have however not been so successful. Several attempts at learning unsupervised representations of sentences have not reached satisfactory enough performance to be widely adopted. In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks. Much like how computer vision uses ImageNet to obtain features, which can then be transferred to other tasks, our work tends to indicate the suitability of natural language inference for transfer learning to other NLP tasks. Our encoder is publicly available.

2016

pdf bib
Does Multimodality Help Human and Machine for Translation and Image Captioning?
Ozan Caglayan | Walid Aransa | Yaxing Wang | Marc Masana | Mercedes García-Martínez | Fethi Bougares | Loïc Barrault | Joost van de Weijer
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

pdf bib
SHEF-LIUM-NN: Sentence level Quality Estimation with Neural Network Features
Kashif Shah | Fethi Bougares | Loïc Barrault | Lucia Specia
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2015

pdf bib
Incremental Adaptation Strategies for Neural Network Language Models
Alex Ter-Sarkisov | Holger Schwenk | Fethi Bougares | Loïc Barrault
Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality

pdf bib
Continuous Adaptation to User Feedback for Statistical Machine Translation
Frédéric Blain | Fethi Bougares | Amir Hazem | Loïc Barrault | Holger Schwenk
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

pdf bib
Using Hypothesis Selection Based Features for Confusion Network MT System Combination
Sahar Ghannay | Loïc Barrault
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)

pdf bib
The MateCat Tool
Marcello Federico | Nicola Bertoldi | Mauro Cettolo | Matteo Negri | Marco Turchi | Marco Trombetti | Alessandro Cattelan | Antonio Farina | Domenico Lupinetti | Andrea Martines | Alberto Massidda | Holger Schwenk | Loïc Barrault | Frederic Blain | Philipp Koehn | Christian Buck | Ulrich Germann
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations

2013

pdf bib
Multimodal Comparable Corpora as Resources for Extracting Parallel Data: Parallel Phrases Extraction
Haithem Afli | Loïc Barrault | Holger Schwenk
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf bib
Proceedings of RECITAL 2013
Florian Boudin | Loïc Barrault
Proceedings of RECITAL 2013

2012

pdf bib
Traduction automatique à partir de corpus comparables: extraction de phrases parallèles à partir de données comparables multimodales (Automatic Translation from Comparable corpora : extracting parallel sentences from multimodal comparable corpora) [in French]
Haithem Afli | Loïc Barrault | Holger Schwenk
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 2: TALN

pdf bib
LIUM’s SMT Machine Translation Systems for WMT 2012
Christophe Servan | Patrik Lambert | Anthony Rousseau | Holger Schwenk | Loïc Barrault
Proceedings of the Seventh Workshop on Statistical Machine Translation

2011

pdf bib
MANY improvements for WMT’11
Loïc Barrault
Proceedings of the Sixth Workshop on Statistical Machine Translation

pdf bib
LIUM’s SMT Machine Translation Systems for WMT 2011
Holger Schwenk | Patrik Lambert | Loïc Barrault | Christophe Servan | Sadaf Abdul-Rauf | Haithem Afli | Kashif Shah
Proceedings of the Sixth Workshop on Statistical Machine Translation

pdf bib
Parametric Weighting of Parallel Data for Statistical Machine Translation
Kashif Shah | Loïc Barrault | Holger Schwenk
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

pdf bib
MANY: Open Source MT System Combination at WMT’10
Loïc Barrault
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

pdf bib
Translation Model Adaptation by Resampling
Kashif Shah | Loïc Barrault | Holger Schwenk
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

2009

pdf bib
SMT and SPE Machine Translation Systems for WMT‘09
Holger Schwenk | Sadaf Abdul-Rauf | Loïc Barrault | Jean Senellart
Proceedings of the Fourth Workshop on Statistical Machine Translation