Germán Sanchis-Trilles

Also published as: Germán Sanchis, Germán Sanchis Trilles


2019

Filtering of Noisy Parallel Corpora Based on Hypothesis Generation
Zuzanna Parcheta | Germán Sanchis-Trilles | Francisco Casacuberta
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

The WMT2019 shared task on filtering noisy parallel corpora challenges participants to create filtering methods that are useful for training machine translation systems. In this work, we introduce a noisy parallel corpus filtering system based on generating hypotheses by means of a translation model. We train translation models for both language pairs, Nepali–English and Sinhala–English, using the provided parallel corpora. We select the training subset for three language pairs (Nepali, Sinhala and Hindi to English) jointly, using bilingual cross-entropy selection, to create the best possible translation model for both language pairs. Once the translation models are trained, we translate the noisy corpora and generate a hypothesis for each sentence pair. We then compute the smoothed BLEU score between the target sentence and the generated hypothesis. In addition, we apply several rules to discard very noisy or inadequate sentences, which can lower the translation score. These heuristics are based on sentence length, source and target similarity, and source language detection. We compare our results with the baseline published on the shared task website, which uses the Zipporah model, and achieve significant improvements over it in one of the conditions of the shared task. The filtering system is domain independent, and all experiments are conducted using neural machine translation.
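The scoring step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the add-one smoothing variant, the function names `smoothed_bleu` and `keep_pair`, and the thresholds `min_bleu` and `max_length_ratio` are all assumptions made for illustration, and the source language detection rule is omitted. The hypothesis is assumed to come from a separately trained translation model.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def smoothed_bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU with add-one smoothing on orders above 1.

    The smoothing variant is an assumption; the abstract does not name one.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref or not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if n == 1:
            if matches == 0:
                return 0.0
            precision = matches / total
        else:
            precision = (matches + 1) / (total + 1)
        log_precision += math.log(precision) / max_n
    # Brevity penalty for hypotheses shorter than the reference.
    brevity = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity * math.exp(log_precision)


def keep_pair(source, target, hypothesis, min_bleu=0.1, max_length_ratio=3.0):
    """Apply hypothetical length and similarity rules, then the BLEU threshold."""
    src, tgt = source.split(), target.split()
    if not src or not tgt:
        return False
    # Sentence-length rule: discard pairs with very unbalanced lengths.
    ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
    if ratio > max_length_ratio:
        return False
    # Source/target similarity rule: discard pairs where the target is a copy.
    if source.strip() == target.strip():
        return False
    # Score the pair by comparing the target with the model's hypothesis.
    return smoothed_bleu(target, hypothesis) >= min_bleu
```

In such a setup, `keep_pair` would be called once per sentence pair of the noisy corpus, and the BLEU score itself could also be used to rank pairs rather than to threshold them.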

2014

CASMACAT: A Computer-assisted Translation Workbench
Vicent Alabau | Christian Buck | Michael Carl | Francisco Casacuberta | Mercedes García-Martínez | Ulrich Germann | Jesús González-Rubio | Robin Hill | Philipp Koehn | Luis Leiva | Bartolomé Mesa-Lao | Daniel Ortiz-Martínez | Herve Saint-Amand | Germán Sanchis Trilles | Chara Tsoukala
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation
Ulrich Germann | Michael Carl | Philipp Koehn | Germán Sanchis-Trilles | Francisco Casacuberta | Robin Hill | Sharon O’Brien
Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation

Efficient wordgraph for interactive translation prediction
Germán Sanchis-Trilles | Daniel Ortiz-Martínez | Francisco Casacuberta
Proceedings of the 17th Annual conference of the European Association for Machine Translation

Evaluating the effects of interactivity in a post-editing workbench
Nancy Underwood | Bartolomé Mesa-Lao | Mercedes García Martínez | Michael Carl | Vicent Alabau | Jesús González-Rubio | Luis A. Leiva | Germán Sanchis-Trilles | Daniel Ortíz-Martínez | Francisco Casacuberta
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes the field trial and subsequent evaluation of a post-editing workbench which is currently under development in the EU-funded CasMaCat project. Based on user evaluations of the initial prototype of the workbench, this second prototype of the workbench includes a number of interactive features designed to improve productivity and user satisfaction. Using CasMaCat’s own facilities for logging keystrokes and eye tracking, data were collected from nine post-editors in a professional setting. These data were then used to investigate the effects of the interactive features on productivity, quality, user satisfaction and cognitive load as reflected in the post-editors’ gaze activity. These quantitative results are combined with the qualitative results derived from user questionnaires and interviews conducted with all the participants.

Online optimisation of log-linear weights in interactive machine translation
Mara Chinea Rios | Germán Sanchis-Trilles | Daniel Ortiz-Martínez | Francisco Casacuberta
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Whenever the quality provided by a machine translation system is not enough, a human expert is required to correct the sentences provided by the machine translation system. In such a setup, it is crucial that the system is able to learn from the errors that have already been corrected. In this paper, we analyse the applicability of discriminative ridge regression for learning the log-linear weights of a state-of-the-art machine translation system underlying an interactive machine translation framework, with encouraging results.
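As a rough illustration of how ridge regression can produce log-linear weights, the Python sketch below fits a weight vector so that the weighted sum of a hypothesis's feature values tracks a quality score. It is a generic closed-form ridge regression, not the paper's discriminative formulation: the function names `solve` and `ridge_weights`, the `penalty` parameter, and the layout of `features` (one row of model feature values per hypothesis) and `quality` (one score per hypothesis) are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; try a larger ridge penalty")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ridge_weights(features, quality, penalty=1.0):
    """Closed-form ridge regression: w = (X^T X + penalty * I)^-1 X^T y."""
    dim = len(features[0])
    # Gram matrix X^T X with the ridge penalty added on the diagonal.
    gram = [[sum(row[i] * row[j] for row in features) + (penalty if i == j else 0.0)
             for j in range(dim)] for i in range(dim)]
    # Moment vector X^T y.
    moment = [sum(row[i] * q for row, q in zip(features, quality)) for i in range(dim)]
    return solve(gram, moment)
```

In an online setting, a function like `ridge_weights` would be re-run after each corrected sentence, with the hypotheses and quality scores derived from the user's correction appended to `features` and `quality`.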

2012

Does more data always yield better translations?
Guillem Gascó | Martha-Alicia Rocha | Germán Sanchis-Trilles | Jesús Andrés-Ferrer | Francisco Casacuberta
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

Bilingual segmentation for phrasetable pruning in Statistical Machine Translation
Germán Sanchis-Trilles | Daniel Ortiz-Martínez | Jesús González-Rubio | Jorge González
Proceedings of the 15th Annual conference of the European Association for Machine Translation

2010

A Deterministic Annealing-Based Training Algorithm For Statistical Machine Translation Models
Pascual Martínez Gómez | Kei Hashimoto | Yoshihiko Nankaku | Keiichi Tokuda | Germán Sanchis-Trilles
Proceedings of the 14th Annual conference of the European Association for Machine Translation

Online Language Model adaptation via N-gram Mixtures for Statistical Machine Translation
Germán Sanchis-Trilles | Mauro Cettolo
Proceedings of the 14th Annual conference of the European Association for Machine Translation

UPV-PRHLT English–Spanish System for WMT10
Germán Sanchis-Trilles | Jesús Andrés-Ferrer | Guillem Gascó | Jesús González-Rubio | Pascual Martínez-Gómez | Martha-Alicia Rocha | Joan-Andreu Sánchez | Francisco Casacuberta
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

UCH-UPV English–Spanish System for WMT10
Francisco Zamora-Martínez | Germán Sanchis-Trilles
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

The UPV-PRHLT Combination System for WMT 2010
Jesús González-Rubio | Germán Sanchis-Trilles | Joan-Andreu Sánchez | Jesús Andrés-Ferrer | Guillem Gascó | Pascual Martínez-Gómez | Martha-Alicia Rocha | Francisco Casacuberta
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Log-linear weight optimisation via Bayesian Adaptation in Statistical Machine Translation
Germán Sanchis-Trilles | Francisco Casacuberta
Coling 2010: Posters

2008

Using Parsed Corpora for Estimating Stochastic Inversion Transduction Grammars
Germán Sanchis | Joan Andreu Sánchez
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

An important problem when using Stochastic Inversion Transduction Grammars is their computational cost. More specifically, when dealing with corpora such as Europarl, even a single iteration of the estimation algorithm becomes prohibitive. In this work, we reduce this cost by taking advantage of the bracketing information in parsed corpora, and we show machine translation results obtained with a bracketed Europarl corpus, which yield interesting improvements when the number of non-terminal symbols is increased.

A novel alignment model inspired on IBM Model 1
Jesús González-Rubio | Germán Sanchis-Trilles | Alfons Juan | Francisco Casacuberta
Proceedings of the 12th Annual conference of the European Association for Machine Translation

Improving Interactive Machine Translation via Mouse Actions
Germán Sanchis-Trilles | Daniel Ortiz-Martínez | Jorge Civera | Francisco Casacuberta | Enrique Vidal | Hieu Hoang
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing