Frédéric Blain

Also published as: Frederic Blain


2020

Multimodal Quality Estimation for Machine Translation
Shu Okabe | Frédéric Blain | Lucia Specia
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We propose approaches to Quality Estimation (QE) for Machine Translation that explore both text and visual modalities for Multimodal QE. We compare various multimodality integration and fusion strategies. For both sentence-level and document-level predictions, we show that state-of-the-art neural and feature-based QE frameworks obtain better results when using the additional modality.

Quality In, Quality Out: Learning from Actual Mistakes
Frederic Blain | Nikolaos Aletras | Lucia Specia
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

Approaches to Quality Estimation (QE) of machine translation have shown promising results at predicting quality scores for translated sentences. However, QE models are often trained on noisy approximations of quality annotations derived from the proportion of post-edited words in translated sentences instead of direct human annotations of translation errors. The latter is a more reliable ground truth but more expensive to obtain. In this paper, we present the first attempt to model the task of predicting the proportion of actual translation errors in a sentence while minimising the need for direct human annotation. For that purpose, we use transfer learning to leverage large-scale noisy annotations and small sets of high-fidelity human-annotated translation errors to train QE models. Experiments on four language pairs and translations obtained by statistical and neural models show consistent gains over strong baselines.

Unsupervised Quality Estimation for Neural Machine Translation
Marina Fomicheva | Shuo Sun | Lisa Yankovskaya | Frédéric Blain | Francisco Guzmán | Mark Fishel | Nikolaos Aletras | Vishrav Chaudhary | Lucia Specia
Transactions of the Association for Computational Linguistics, Volume 8

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time. Existing approaches require large amounts of expert-annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Unlike most current work, which treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By utilizing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivaling state-of-the-art supervised QE models. To evaluate our approach we collect the first dataset that enables work on both black-box and glass-box approaches to QE.

2018

deepQuest: A Framework for Neural-based Quality Estimation
Julia Ive | Frédéric Blain | Lucia Specia
Proceedings of the 27th International Conference on Computational Linguistics

Predicting Machine Translation (MT) quality can help in many practical tasks such as MT post-editing. The performance of Quality Estimation (QE) methods has drastically improved recently with the introduction of neural approaches to the problem. However, thus far neural approaches have only been designed for word and sentence-level prediction. We present a neural framework that is able to accommodate neural QE approaches at these fine-grained levels and generalize them to the level of documents. We test the framework with two sentence-level neural QE approaches: a state-of-the-art approach that requires extensive pre-training, and a new lightweight approach that we propose, which employs basic encoders. Our approach is significantly faster and yields performance improvements for a range of document-level quality estimation tasks. To our knowledge, this is the first neural architecture for document-level QE. In addition, for the first time we apply QE models to the output of both statistical and neural MT systems for a series of European languages and highlight the new challenges resulting from the use of neural MT.

Combining Quality Estimation and Automatic Post-editing to Enhance Machine Translation output
Rajen Chatterjee | Matteo Negri | Marco Turchi | Frédéric Blain | Lucia Specia
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

Findings of the WMT 2018 Shared Task on Quality Estimation
Lucia Specia | Frédéric Blain | Varvara Logacheva | Ramón Astudillo | André F. T. Martins
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We report the results of the WMT18 shared task on Quality Estimation, i.e. the task of predicting the quality of the output of machine translation systems at various granularity levels: word, phrase, sentence and document. This year we include four language pairs, three text domains, and translations produced by both statistical and neural machine translation systems. Participating teams from ten institutions submitted a variety of systems to different task variants and language pairs.

Sheffield Submissions for the WMT18 Quality Estimation Shared Task
Julia Ive | Carolina Scarton | Frédéric Blain | Lucia Specia
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

In this paper we present the University of Sheffield submissions for the WMT18 Quality Estimation shared task. We discuss our submissions to all four sub-tasks, where ours is the only team to participate in all language pairs and variations (37 combinations). Our systems show competitive results and outperform the baseline in nearly all cases.

2017

Guiding Neural Machine Translation Decoding with External Knowledge
Rajen Chatterjee | Matteo Negri | Marco Turchi | Marcello Federico | Lucia Specia | Frédéric Blain
Proceedings of the Second Conference on Machine Translation

The QT21 Combined Machine Translation System for English to Latvian
Jan-Thorsten Peter | Hermann Ney | Ondřej Bojar | Ngoc-Quan Pham | Jan Niehues | Alex Waibel | Franck Burlot | François Yvon | Mārcis Pinnis | Valters Šics | Jasmijn Bastings | Miguel Rios | Wilker Aziz | Philip Williams | Frédéric Blain | Lucia Specia
Proceedings of the Second Conference on Machine Translation

Bilexical Embeddings for Quality Estimation
Frédéric Blain | Carolina Scarton | Lucia Specia
Proceedings of the Second Conference on Machine Translation

2016

Phrase Level Segmentation and Labelling of Machine Translation Errors
Frédéric Blain | Varvara Logacheva | Lucia Specia
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper presents our work towards a novel approach for Quality Estimation (QE) of machine translation based on sequences of adjacent words, the so-called phrases. This new level of QE aims to provide a natural balance between QE at word and sentence level, which are either too fine-grained or too coarse for some applications. However, phrase-level QE implies an intrinsic challenge: how to segment a machine translation into sequences of words (contiguous or not) that represent an error. We discuss three possible segmentation strategies to automatically extract erroneous phrases. We evaluate these strategies against annotations at phrase level produced by humans, using a new dataset collected for this purpose.

Sheffield Systems for the English-Romanian WMT Translation Task
Frédéric Blain | Xingyi Song | Lucia Specia
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

The QT21/HimL Combined Machine Translation System
Jan-Thorsten Peter | Tamer Alkhouli | Hermann Ney | Matthias Huck | Fabienne Braune | Alexander Fraser | Aleš Tamchyna | Ondřej Bojar | Barry Haddow | Rico Sennrich | Frédéric Blain | Lucia Specia | Jan Niehues | Alex Waibel | Alexandre Allauzen | Lauriane Aufrant | Franck Burlot | Elena Knyazeva | Thomas Lavergne | François Yvon | Mārcis Pinnis | Stella Frank
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

USFD’s Phrase-level Quality Estimation Systems
Varvara Logacheva | Frédéric Blain | Lucia Specia
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

USFD at SemEval-2016 Task 1: Putting different State-of-the-Arts into a Box
Ahmet Aker | Frederic Blain | Andres Duque | Marina Fomicheva | Jurica Seva | Kashif Shah | Daniel Beck
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

SHEF-NN: Translation Quality Estimation with Neural Networks
Kashif Shah | Varvara Logacheva | Gustavo Paetzold | Frederic Blain | Daniel Beck | Fethi Bougares | Lucia Specia
Proceedings of the Tenth Workshop on Statistical Machine Translation

Continuous Adaptation to User Feedback for Statistical Machine Translation
Frédéric Blain | Fethi Bougares | Amir Hazem | Loïc Barrault | Holger Schwenk
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

The MateCat Tool
Marcello Federico | Nicola Bertoldi | Mauro Cettolo | Matteo Negri | Marco Turchi | Marco Trombetti | Alessandro Cattelan | Antonio Farina | Domenico Lupinetti | Andrea Martines | Alberto Massidda | Holger Schwenk | Loïc Barrault | Frederic Blain | Philipp Koehn | Christian Buck | Ulrich Germann
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations

2012

Automatic Translation of Scientific Documents in the HAL Archive
Patrik Lambert | Holger Schwenk | Frédéric Blain
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes the development of a statistical machine translation system between French and English for scientific papers. This system will be closely integrated into the French HAL open archive, a collection of more than 100,000 scientific papers. We describe the creation of in-domain parallel and monolingual corpora, the development of a domain-specific translation system with the created resources, and its adaptation using monolingual resources only. These techniques allowed us to improve a generic system by more than 10 BLEU points.

Collaborative Machine Translation Service for Scientific texts
Patrik Lambert | Jean Senellart | Laurent Romary | Holger Schwenk | Florian Zipser | Patrice Lopez | Frédéric Blain
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics