Jeremy Gwinnup


2020

The AFRL IWSLT 2020 Systems: Work-From-Home Edition
Brian Ore | Eric Hansen | Tim Anderson | Jeremy Gwinnup
Proceedings of the 17th International Conference on Spoken Language Translation

This report summarizes the Air Force Research Laboratory (AFRL) submission to the offline spoken language translation (SLT) task as part of the IWSLT 2020 evaluation campaign. As in previous years, we chose to adopt the cascade approach of using separate systems to perform speech activity detection, automatic speech recognition, sentence segmentation, and machine translation. All systems were neural-based, including a fully-connected neural network for speech activity detection, a Kaldi factorized time delay neural network with recurrent neural network (RNN) language model rescoring for speech recognition, a bidirectional RNN with an attention mechanism for sentence segmentation, and transformer networks trained with OpenNMT and Marian for machine translation. Our primary submission yielded BLEU scores of 21.28 on tst2019 and 23.33 on tst2020.

2019

The AFRL WMT19 Systems: Old Favorites and New Tricks
Jeremy Gwinnup | Grant Erdmann | Tim Anderson
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper describes the Air Force Research Laboratory (AFRL) machine translation systems and the improvements that were developed during the WMT19 evaluation campaign. This year, we refine our approach to training popular neural machine translation toolkits, experiment with a new domain adaptation technique and again measure improvements in performance on the Russian–English language pair.

Quality and Coverage: The AFRL Submission to the WMT19 Parallel Corpus Filtering for Low-Resource Conditions Task
Grant Erdmann | Jeremy Gwinnup
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

The WMT19 Parallel Corpus Filtering For Low-Resource Conditions Task aims to test various methods of filtering noisy parallel corpora to make them useful for training machine translation systems. This year the noisy corpora cover the relatively low-resource language pairs Nepali-English and Sinhala-English. This paper describes the Air Force Research Laboratory (AFRL) submissions, including preprocessing methods and scoring metrics. Numerical results indicate a benefit over the baseline and show the relative benefits of different options.

Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation
Brian Thompson | Jeremy Gwinnup | Huda Khayrallah | Kevin Duh | Philipp Koehn
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Continued training is an effective method for domain adaptation in neural machine translation. However, in-domain gains from adaptation come at the expense of general-domain performance. In this work, we interpret the drop in general-domain performance as catastrophic forgetting of general-domain knowledge. To mitigate it, we adapt Elastic Weight Consolidation (EWC)—a machine learning method for learning a new task without forgetting previous tasks. Our method retains the majority of general-domain performance lost in continued training without degrading in-domain performance, outperforming the previous state-of-the-art. We also explore the full range of general-domain performance available when some in-domain degradation is acceptable.
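
As a rough illustration of the idea in this abstract, the sketch below shows the commonly cited quadratic form of an EWC penalty added to an in-domain training loss. It is a minimal sketch under stated assumptions, not the paper's implementation: the diagonal Fisher approximation, the function names, and the `strength` parameter are all hypothetical choices made here for illustration.

```python
def ewc_penalty(params, anchor_params, fisher_diag, strength):
    """Quadratic EWC penalty: (strength / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    params        -- current parameter values during continued training
    anchor_params -- parameter values of the general-domain model (assumed fixed)
    fisher_diag   -- per-parameter importance weights (assumed diagonal Fisher estimate)
    strength      -- hypothetical weight trading off adaptation against retention
    """
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_loss(in_domain_loss, params, anchor_params, fisher_diag, strength):
    """Total objective: in-domain loss plus the EWC regularization term."""
    return in_domain_loss + ewc_penalty(params, anchor_params, fisher_diag, strength)
```

Parameters with large importance weights are pulled back toward their general-domain values, while parameters with small weights remain free to adapt to the new domain.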

2018

Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation
Brian Thompson | Huda Khayrallah | Antonios Anastasopoulos | Arya D. McCarthy | Kevin Duh | Rebecca Marvin | Paul McNamee | Jeremy Gwinnup | Tim Anderson | Philipp Koehn
Proceedings of the Third Conference on Machine Translation: Research Papers

To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component’s contribution to, and capacity for, domain adaptation. We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed. We also find that continued training does not move the model very far from the out-of-domain model, compared to a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain.
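
The freezing setup described in this abstract can be pictured with a toy update step that skips selected components. This is only a sketch under assumptions: the component names, the dictionary layout, and the plain SGD update are hypothetical and are not taken from the paper or from any NMT toolkit.

```python
def sgd_step(params, grads, frozen, lr):
    """Apply one plain SGD update, leaving frozen components unchanged.

    params -- dict mapping a component name (e.g. "encoder") to a list of floats
    grads  -- dict with the same keys holding the matching gradients
    frozen -- set of component names that must not be updated
    lr     -- learning rate
    """
    updated = {}
    for name, values in params.items():
        if name in frozen:
            # Frozen component: copy the values through untouched.
            updated[name] = list(values)
        else:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated


# Freeze the encoder and adapt only the decoder.
params = {"encoder": [1.0], "decoder": [2.0]}
grads = {"encoder": [0.5], "decoder": [0.5]}
adapted = sgd_step(params, grads, frozen={"encoder"}, lr=1.0)
```

Adapting a single component while holding the rest fixed corresponds to placing every other component name in `frozen`.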

The AFRL WMT18 Systems: Ensembling, Continuation and Combination
Jeremy Gwinnup | Tim Anderson | Grant Erdmann | Katherine Young
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper describes the Air Force Research Laboratory (AFRL) machine translation systems and the improvements that were developed during the WMT18 evaluation campaign. This year, we examined the developments and additions to popular neural machine translation toolkits and measured improvements in performance on the Russian–English language pair.

The AFRL-Ohio State WMT18 Multimodal System: Combining Visual with Traditional
Jeremy Gwinnup | Joshua Sandvick | Michael Hutt | Grant Erdmann | John Duselis | James Davis
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

AFRL-Ohio State extends its usage of visual domain-driven machine translation for use as a peer with traditional machine translation systems. As a peer, it is enveloped into a system combination of neural and statistical MT systems to present a composite translation.

Coverage and Cynicism: The AFRL Submission to the WMT 2018 Parallel Corpus Filtering Task
Grant Erdmann | Jeremy Gwinnup
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

The WMT 2018 Parallel Corpus Filtering Task aims to test various methods of filtering a noisy parallel corpus, to make it useful for training machine translation systems. We describe the AFRL submissions, including their preprocessing methods and quality metrics. Numerical results indicate relative benefits of different options and show where our methods are competitive.

2017

The AFRL-MITLL WMT17 Systems: Old, New, Borrowed, BLEU
Jeremy Gwinnup | Timothy Anderson | Grant Erdmann | Katherine Young | Michaeel Kazi | Elizabeth Salesky | Brian Thompson | Jonathan Taylor
Proceedings of the Second Conference on Machine Translation

The AFRL-OSU WMT17 Multimodal Translation System: An Image Processing Approach
John Duselis | Michael Hutt | Jeremy Gwinnup | James Davis | Joshua Sandvick
Proceedings of the Second Conference on Machine Translation

The AFRL WMT17 Neural Machine Translation Training Task Submission
Grant Erdmann | Katherine Young | Jeremy Gwinnup
Proceedings of the Second Conference on Machine Translation

2016

The AFRL-MITLL WMT16 News-Translation Task Systems
Jeremy Gwinnup | Tim Anderson | Grant Erdmann | Katherine Young | Michaeel Kazi | Elizabeth Salesky | Brian Thompson
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2015

The AFRL-MITLL WMT15 System: There’s More than One Way to Decode It!
Jeremy Gwinnup | Tim Anderson | Grant Erdmann | Katherine Young | Christina May | Michaeel Kazi | Elizabeth Salesky | Brian Thompson
Proceedings of the Tenth Workshop on Statistical Machine Translation

Drem: The AFRL Submission to the WMT15 Tuning Task
Grant Erdmann | Jeremy Gwinnup
Proceedings of the Tenth Workshop on Statistical Machine Translation

2014

Machine Translation and Monolingual Postediting: The AFRL WMT-14 System
Lane Schwartz | Timothy Anderson | Jeremy Gwinnup | Katherine Young
Proceedings of the Ninth Workshop on Statistical Machine Translation