Bill Byrne


2020

The Teacher-Student Chatroom Corpus
Andrew Caines | Helen Yannakoudakis | Helena Edmondson | Helen Allen | Pascual Pérez-Paredes | Bill Byrne | Paula Buttery
Proceedings of the 9th Workshop on NLP for Computer Assisted Language Learning

Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem
Danielle Saunders | Bill Byrne
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Training data for NLP tasks often exhibits gender bias in that fewer sentences refer to women than to men. In Neural Machine Translation (NMT) gender bias has been shown to reduce translation quality, particularly when the target language has grammatical gender. The recent WinoMT challenge set allows us to measure this effect directly (Stanovsky et al., 2019). Ideally we would reduce system bias by simply debiasing all data prior to training, but achieving this effectively is itself a challenge. Rather than attempt to create a ‘balanced’ dataset, we use transfer learning on a small set of trusted, gender-balanced examples. This approach gives strong and consistent improvements in gender debiasing with much less computational cost than training from scratch. A known pitfall of transfer learning on new domains is ‘catastrophic forgetting’, which we address at adaptation and inference time. During adaptation we show that Elastic Weight Consolidation allows a performance trade-off between general translation quality and bias reduction. At inference time we propose a lattice-rescoring scheme which outperforms all systems evaluated in Stanovsky et al. (2019) on WinoMT with no degradation of general test set BLEU. We demonstrate our approach translating from English into three languages with varied linguistic properties and data availability.

Using Context in Neural Machine Translation Training Objectives
Danielle Saunders | Felix Stahlberg | Bill Byrne
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We present Neural Machine Translation (NMT) training using document-level metrics with batch-level documents. Previous sequence-objective approaches to NMT training focus exclusively on sentence-level metrics like sentence BLEU which do not correspond to the desired evaluation metric, typically document BLEU. Meanwhile, research into document-level NMT training focuses on data or model architecture rather than training procedure. We find that each of these lines of research has a clear space in it for the other, and propose merging them with a scheme that allows a document-level evaluation metric to be used in the NMT training objective. We first sample pseudo-documents from sentence samples. We then approximate the expected document BLEU gradient with Monte Carlo sampling for use as a cost function in Minimum Risk Training (MRT). This two-level sampling procedure gives NMT performance gains over sequence MRT and maximum-likelihood training. We demonstrate that training is more robust with document-level metrics than with sequence metrics. We further demonstrate improvements on NMT with TER and Grammatical Error Correction (GEC) using GLEU, both metrics used at the document level for evaluations.

2019

Semi-Supervised Bootstrapping of Dialogue State Trackers for Task-Oriented Modelling
Bo-Hsiang Tseng | Marek Rei | Paweł Budzianowski | Richard Turner | Bill Byrne | Anna Korhonen
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Dialogue systems benefit greatly from optimizing on detailed annotations, such as transcribed utterances, internal dialogue state representations and dialogue act labels. However, collecting these annotations is expensive and time-consuming, holding back development in the area of dialogue modelling. In this paper, we investigate semi-supervised learning methods that are able to reduce the amount of required intermediate labelling. We find that by leveraging un-annotated data instead, the amount of turn-level annotations of dialogue state can be significantly reduced when building a neural dialogue system. Our analysis on the MultiWOZ corpus, covering a range of domains and topics, finds that annotations can be reduced by up to 30% while maintaining equivalent system performance. We also describe and evaluate the first end-to-end dialogue model created for the MultiWOZ corpus.

On NMT Search Errors and Model Errors: Cat Got Your Tongue?
Felix Stahlberg | Bill Byrne
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We report on search errors and model errors in neural machine translation (NMT). We present an exact inference procedure for neural sequence models based on a combination of beam search and depth-first search. We use our exact search to find the global best model scores under a Transformer base model for the entire WMT15 English-German test set. Surprisingly, beam search fails to find these global best model scores in most cases, even with a very large beam size of 100. For more than 50% of the sentences, the model in fact assigns its global best score to the empty translation, revealing a massive failure of neural models in properly accounting for adequacy. We show by constraining search with a minimum translation length that at the root of the problem of empty translations lies an inherent bias towards shorter translations. We conclude that vanilla NMT in its current form requires just the right amount of beam search errors, which, from a modelling perspective, is a highly unsatisfactory conclusion indeed, as the model often prefers an empty translation.

Taskmaster-1: Toward a Realistic and Diverse Dialog Dataset
Bill Byrne | Karthik Krishnamoorthi | Chinnadhurai Sankar | Arvind Neelakantan | Ben Goodrich | Daniel Duckworth | Semih Yavuz | Amit Dubey | Kyu-Young Kim | Andy Cedilnik
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

A significant barrier to progress in data-driven approaches to building dialog systems is the lack of high-quality, goal-oriented conversational data. To help satisfy this elementary requirement, we introduce the initial release of the Taskmaster-1 dataset which includes 13,215 task-based dialogs comprising six domains. Two procedures were used to create this collection, each with unique advantages. The first involves a two-person, spoken “Wizard of Oz” (WOz) approach in which trained agents and crowdsourced workers interact to complete the task while the second is “self-dialog” in which crowdsourced workers write the entire dialog themselves. We do not restrict the workers to detailed scripts or to a small knowledge base and hence we observe that our dataset contains more realistic and diverse conversations in comparison to existing datasets. We offer several baseline models including state-of-the-art neural seq2seq architectures with benchmark performance as well as qualitative human evaluations. Dialogs are labeled with API calls and arguments, a simple and cost-effective approach which avoids the requirement of complex annotation schema. The layer of abstraction between the dialog model and the service provider API allows for a given model to interact with multiple services that provide similar functionality. Finally, the dataset will evoke interest in written vs. spoken language, discourse patterns, error handling and other linguistic phenomena related to dialog system research, development and design.

Domain Adaptive Inference for Neural Machine Translation
Danielle Saunders | Felix Stahlberg | Adrià de Gispert | Bill Byrne
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We investigate adaptive ensemble weighting for Neural Machine Translation, addressing the case of improving performance on a new and potentially unknown domain without sacrificing performance on the original domain. We adapt sequentially across two Spanish-English and three English-German tasks, comparing unregularized fine-tuning, L2 and Elastic Weight Consolidation. We then report a novel scheme for adaptive NMT ensemble decoding by extending Bayesian Interpolation with source information, and report strong improvements across test domains without access to the domain label.
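The abstract above compares L2 regularization and Elastic Weight Consolidation (EWC) as ways of keeping an adapted model close to the original one. As a rough illustration only, and not the authors' implementation, the two penalty terms might be sketched as follows. The names `theta`, `theta_star`, `fisher` and `lam` are hypothetical, the diagonal Fisher information estimate is assumed to be given, and any constant factor such as 1/2 is assumed to be absorbed into `lam`.

```python
def l2_penalty(theta, theta_star, lam):
    """L2 penalty: every parameter is pulled equally towards its original value."""
    return lam * sum((t - s) ** 2 for t, s in zip(theta, theta_star))


def ewc_penalty(theta, theta_star, fisher, lam):
    """EWC penalty: each squared deviation is weighted by a (diagonal)
    Fisher information estimate, so parameters that mattered more for
    the original domain are held more tightly."""
    return lam * sum(
        f * (t - s) ** 2 for f, t, s in zip(fisher, theta, theta_star)
    )


# Toy values, invented for illustration.
theta_star = [1.0, -2.0, 0.5]   # parameters after training on the original domain
theta = [1.5, -2.0, 0.0]        # parameters during adaptation
fisher = [4.0, 1.0, 0.1]        # assumed per-parameter importance estimates

l2_value = l2_penalty(theta, theta_star, lam=1.0)
ewc_value = ewc_penalty(theta, theta_star, fisher, lam=1.0)
```

In this toy case both moved parameters contribute equally to the L2 penalty, while under EWC the first one (high Fisher value) dominates. Either term would be added to the fine-tuning loss on the new domain.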

The CUED’s Grammatical Error Correction Systems for BEA-2019
Felix Stahlberg | Bill Byrne
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

We describe two entries from the Cambridge University Engineering Department to the BEA 2019 Shared Task on grammatical error correction. Our submission to the low-resource track is based on prior work on using finite state transducers together with strong neural language models. Our system for the restricted track is a purely neural system consisting of neural language models and neural machine translation models trained with back-translation and a combination of checkpoint averaging and fine-tuning – without the help of any additional tools like spell checkers. The latter system has been used inside a separate system combination entry in cooperation with the Cambridge University Computer Lab.

Neural and FST-based approaches to grammatical error correction
Zheng Yuan | Felix Stahlberg | Marek Rei | Bill Byrne | Helen Yannakoudakis
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

In this paper, we describe our submission to the BEA 2019 shared task on grammatical error correction. We present a system pipeline that utilises both error detection and correction models. The input text is first corrected by two complementary neural machine translation systems: one using convolutional networks and multi-task learning, and another using a neural Transformer-based system. Training is performed on publicly available data, along with artificial examples generated through back-translation. The n-best lists of these two machine translation systems are then combined and scored using a finite state transducer (FST). Finally, an unsupervised re-ranking system is applied to the n-best output of the FST. The re-ranker uses a number of error detection features to re-rank the FST n-best list and identify the final 1-best correction hypothesis. Our system achieves 66.75% F0.5 on error correction (ranking 4th), and 82.52% F0.5 on token-level error detection (ranking 2nd) in the restricted track of the shared task.

CUED@WMT19:EWC&LMs
Felix Stahlberg | Danielle Saunders | Adrià de Gispert | Bill Byrne
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

Two techniques provide the fabric of the Cambridge University Engineering Department’s (CUED) entry to the WMT19 evaluation campaign: elastic weight consolidation (EWC) and different forms of language modelling (LMs). We report substantial gains by fine-tuning very strong baselines on former WMT test sets using a combination of checkpoint averaging and EWC. A sentence-level Transformer LM and a document-level LM based on a modified Transformer architecture yield further gains. As in previous years, we also extract n-gram probabilities from SMT lattices which can be seen as a source-conditioned n-gram LM.

UCAM Biomedical Translation at WMT19: Transfer Learning Multi-domain Ensembles
Danielle Saunders | Felix Stahlberg | Bill Byrne
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

The 2019 WMT Biomedical translation task involved translating Medline abstracts. We approached this using transfer learning to obtain a series of strong neural models on distinct domains, and combining them into multi-domain ensembles. We further experimented with an adaptive language-model ensemble weighting scheme. Our submission achieved the best submitted results on both directions of English-Spanish.

Coached Conversational Preference Elicitation: A Case Study in Understanding Movie Preferences
Filip Radlinski | Krisztian Balog | Bill Byrne | Karthik Krishnamoorthi
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Conversational recommendation has recently attracted significant attention. As systems must understand users’ preferences, training them has called for conversational corpora, typically derived from task-oriented conversations. We observe that such corpora often do not reflect how people naturally describe preferences. We present a new approach to obtaining user preferences in dialogue: Coached Conversational Preference Elicitation. It allows collection of natural yet structured conversational preferences. Studying the dialogues in one domain, we present a brief quantitative analysis of how people describe movie preferences at scale. Demonstrating the methodology, we release the CCPE-M dataset to the community with over 500 movie preference dialogues expressing over 10,000 preferences.

Neural Grammatical Error Correction with Finite State Transducers
Felix Stahlberg | Christopher Bryant | Bill Byrne
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Grammatical error correction (GEC) is one of the areas in natural language processing in which purely neural models have not yet superseded more traditional symbolic models. Hybrid systems combining phrase-based statistical machine translation (SMT) and neural sequence models are currently among the most effective approaches to GEC. However, both SMT and neural sequence-to-sequence models require large amounts of annotated data. Language model based GEC (LM-GEC) is a promising alternative which does not rely on annotated training data. We show how to improve LM-GEC by applying modelling techniques based on finite state transducers. We report further gains by rescoring with neural language models. We show that our methods developed for LM-GEC can also be used with SMT systems if annotated training data is available. Our best system outperforms the best published result on the CoNLL-2014 test set, and achieves far better relative improvements over the SMT baselines than previous hybrid systems.

2018

Neural Machine Translation Decoding with Terminology Constraints
Eva Hasler | Adrià de Gispert | Gonzalo Iglesias | Bill Byrne
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

Despite the impressive quality improvements yielded by neural machine translation (NMT) systems, controlling their translation output to adhere to user-provided terminology constraints remains an open problem. We describe our approach to constrained neural decoding based on finite-state machines and multi-stack decoding which supports target-side constraints as well as constraints with corresponding aligned input text spans. We demonstrate the performance of our framework on multiple translation tasks and motivate the need for constrained decoding with attentions as a means of reducing misplacement and duplication when translating user constraints.

Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment
Gonzalo Iglesias | William Tambellini | Adrià de Gispert | Eva Hasler | Bill Byrne
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

We describe a batched beam decoding algorithm for NMT with LMBR n-gram posteriors, showing that LMBR techniques still yield gains on top of the best recently reported results with Transformers. We also discuss acceleration strategies for deployment, and the effect of the beam size and batching on memory and speed.

Multi-representation ensembles and delayed SGD updates improve syntax-based NMT
Danielle Saunders | Felix Stahlberg | Adrià de Gispert | Bill Byrne
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We explore strategies for incorporating target syntax into Neural Machine Translation. We specifically focus on syntax in ensembles containing multiple sentence representations. We formulate beam search over such ensembles using WFSTs, and describe a delayed SGD update training procedure that is especially effective for long representations like linearized syntax. Our approach gives state-of-the-art performance on a difficult Japanese-English task.

Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation
Felix Stahlberg | Danielle Saunders | Gonzalo Iglesias | Bill Byrne
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

An Operation Sequence Model for Explainable Neural Machine Translation
Felix Stahlberg | Danielle Saunders | Bill Byrne
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

We propose to achieve explainable neural machine translation (NMT) by changing the output representation to explain itself. We present a novel approach to NMT which generates the target sentence by monotonically walking through the source sentence. Word reordering is modeled by operations which allow setting markers in the target sentence and move a target-side write head between those markers. In contrast to many modern neural models, our system emits explicit word alignment information which is often crucial to practical machine translation as it improves explainability. Our technique can outperform a plain text system in terms of BLEU score under the recent Transformer architecture on Japanese-English and Portuguese-English, and is within 0.5 BLEU difference on Spanish-English.

The University of Cambridge’s Machine Translation Systems for WMT18
Felix Stahlberg | Adrià de Gispert | Bill Byrne
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

The University of Cambridge submission to the WMT18 news translation task focuses on the combination of diverse models of translation. We compare recurrent, convolutional, and self-attention-based neural models on German-English, English-German, and Chinese-English. Our final system combines all neural models together with a phrase-based SMT system in an MBR-based scheme. We report small but consistent gains on top of strong Transformer ensembles.

2017

Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices
Felix Stahlberg | Adrià de Gispert | Eva Hasler | Bill Byrne
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We present a novel scheme to combine neural machine translation (NMT) with traditional statistical machine translation (SMT). Our approach borrows ideas from linearised lattice minimum Bayes-risk decoding for SMT. The NMT score is combined with the Bayes-risk of the translation according to the SMT lattice. This makes our approach much more flexible than n-best list or lattice rescoring as the neural decoder is not restricted to the SMT search space. We show an efficient and simple way to integrate risk estimation into the NMT decoder which is suitable for word-level as well as subword-unit-level NMT. We test our method on English-German and Japanese-English and report significant gains over lattice rescoring on several data sets for both single and ensembled NMT. The MBR decoder produces entirely new hypotheses far beyond simply rescoring the SMT search space or fixing UNKs in the NMT output.

A Comparison of Neural Models for Word Ordering
Eva Hasler | Felix Stahlberg | Marcus Tomalin | Adrià de Gispert | Bill Byrne
Proceedings of the 10th International Conference on Natural Language Generation

We compare several language models for the word-ordering task and propose a new bag-to-sequence neural model based on attention-based sequence-to-sequence models. We evaluate the model on a large German WMT data set where it significantly outperforms existing models. We also describe a novel search strategy for LM-based word ordering and report results on the English Penn Treebank. Our best model setup outperforms prior work both in terms of speed and quality.

Unfolding and Shrinking Neural Machine Translation Ensembles
Felix Stahlberg | Bill Byrne
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Ensembling is a well-known technique in neural machine translation (NMT) to improve system performance. Instead of a single neural net, multiple neural nets with the same topology are trained separately, and the decoder generates predictions by averaging over the individual models. Ensembling often improves the quality of the generated translations drastically. However, it is not suitable for production systems because it is cumbersome and slow. This work aims to reduce the runtime to be on par with a single system without compromising the translation quality. First, we show that the ensemble can be unfolded into a single large neural network which imitates the output of the ensemble system. We show that unfolding can already improve the runtime in practice since more work can be done on the GPU. We proceed by describing a set of techniques to shrink the unfolded network by reducing the dimensionality of layers. On Japanese-English we report that the resulting network has the size and decoding speed of a single NMT network but performs on the level of a 3-ensemble system.
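The ensembling step described in this abstract, where the decoder averages the predictions of several separately trained models, might look roughly like the sketch below. It is a minimal illustration under stated assumptions rather than the paper's method: the function names and logit values are hypothetical, and arithmetic averaging of next-token probabilities is assumed (averaging in log space is another common choice).

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_average(per_model_logits):
    """Average the next-token distributions predicted by each model."""
    dists = [softmax(logits) for logits in per_model_logits]
    vocab_size = len(dists[0])
    return [
        sum(d[i] for d in dists) / len(dists) for i in range(vocab_size)
    ]


# Hypothetical next-token logits from three models over a 3-word vocabulary.
per_model_logits = [
    [2.0, 0.5, 0.1],
    [1.5, 1.0, 0.2],
    [0.3, 2.2, 0.1],
]

avg = ensemble_average(per_model_logits)
best_token = max(range(len(avg)), key=avg.__getitem__)
```

Each decoding step needs one forward pass per model, which is the runtime cost the abstract refers to when it calls ensembling slow for production use.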

Break it Down for Me: A Study in Automated Lyric Annotation
Lucas Sterckx | Jason Naradowsky | Bill Byrne | Thomas Demeester | Chris Develder
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Comprehending lyrics, as found in songs and poems, can pose a challenge to human and machine readers alike. This motivates the need for systems that can understand the ambiguity and jargon found in such creative texts, and provide commentary to aid readers in reaching the correct interpretation. We introduce the task of automated lyric annotation (ALA). Like text simplification, a goal of ALA is to rephrase the original text in a more easily understandable manner. However, in ALA the system must often include additional information to clarify niche terminology and abstract concepts. To stimulate research on this task, we release a large collection of crowdsourced annotations for song lyrics. We analyze the performance of translation and retrieval models on this task, measuring performance with both automated and human evaluation. We find that each model captures a unique type of information important to the task.

SGNMT – A Flexible NMT Decoding Platform for Quick Prototyping of New Models and Search Strategies
Felix Stahlberg | Eva Hasler | Danielle Saunders | Bill Byrne
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

This paper introduces SGNMT, our experimental platform for machine translation research. SGNMT provides a generic interface to neural and symbolic scoring modules (predictors) with left-to-right semantics, such as translation models like NMT, language models, translation lattices, n-best lists or other kinds of scores and constraints. Predictors can be combined with other predictors to form complex decoding tasks. SGNMT implements a number of search strategies for traversing the space spanned by the predictors which are appropriate for different predictor constellations. Adding new predictors or decoding strategies is particularly easy, making it a very efficient tool for prototyping new research ideas. SGNMT is actively being used by students in the MPhil program in Machine Learning, Speech and Language Technology at the University of Cambridge for course work and theses, as well as for most of the research work in our group.

2016

Speed-Constrained Tuning for Statistical Machine Translation Using Bayesian Optimization
Daniel Beck | Adrià de Gispert | Gonzalo Iglesias | Aurelien Waite | Bill Byrne
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Syntactically Guided Neural Machine Translation
Felix Stahlberg | Eva Hasler | Aurelien Waite | Bill Byrne
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

The Edit Distance Transducer in Action: The University of Cambridge English-German System at WMT16
Felix Stahlberg | Eva Hasler | Bill Byrne
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2015

Transducer Disambiguation with Sparse Topological Features
Gonzalo Iglesias | Adrià de Gispert | Bill Byrne
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Hierarchical Statistical Semantic Realization for Minimal Recursion Semantics
Matic Horvat | Ann Copestake | Bill Byrne
Proceedings of the 11th International Conference on Computational Semantics

The Geometry of Statistical Machine Translation
Aurelien Waite | Bill Byrne
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Fast and Accurate Preordering for SMT using Neural Networks
Adrià de Gispert | Gonzalo Iglesias | Bill Byrne
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Pushdown Automata in Statistical Machine Translation
Cyril Allauzen | Bill Byrne | Adrià de Gispert | Gonzalo Iglesias | Michael Riley
Computational Linguistics, Volume 40, Issue 3 - September 2014

Proceedings of the ACL 2014 Student Research Workshop
Ekaterina Kochmar | Annie Louis | Svitlana Volkova | Jordan Boyd-Graber | Bill Byrne
Proceedings of the ACL 2014 Student Research Workshop

Source-side Preordering for Translation using Logistic Regression and Depth-first Branch-and-Bound Search
Laura Jehl | Adrià de Gispert | Mark Hopkins | Bill Byrne
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Word Ordering with Phrase-Based Grammars
Adrià de Gispert | Marcus Tomalin | Bill Byrne
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Effective Incorporation of Source Syntax into Hierarchical Phrase-based Translation
Tong Xiao | Adrià de Gispert | Jingbo Zhu | Bill Byrne
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers