Andrew Finch


2020

Proceedings of the Fourth Workshop on Neural Generation and Translation
Alexandra Birch | Andrew Finch | Hiroaki Hayashi | Kenneth Heafield | Marcin Junczys-Dowmunt | Ioannis Konstas | Xian Li | Graham Neubig | Yusuke Oda
Proceedings of the Fourth Workshop on Neural Generation and Translation

Findings of the Fourth Workshop on Neural Generation and Translation
Kenneth Heafield | Hiroaki Hayashi | Yusuke Oda | Ioannis Konstas | Andrew Finch | Graham Neubig | Xian Li | Alexandra Birch
Proceedings of the Fourth Workshop on Neural Generation and Translation

We describe the findings of the Fourth Workshop on Neural Generation and Translation, held in concert with the annual conference of the Association for Computational Linguistics (ACL 2020). First, we summarize the research trends of papers presented in the proceedings. Second, we describe the results of the three shared tasks: 1) efficient neural machine translation (NMT), where participants were tasked with creating NMT systems that are both accurate and efficient; 2) document-level generation and translation (DGT), where participants were tasked with developing systems that generate summaries from structured data, potentially with assistance from text in another language; and 3) STAPLE, where participants were tasked with producing as many possible translations of a given input text as they could. This last shared task was organised by Duolingo.

2019

Proceedings of the 3rd Workshop on Neural Generation and Translation
Alexandra Birch | Andrew Finch | Hiroaki Hayashi | Ioannis Konstas | Thang Luong | Graham Neubig | Yusuke Oda | Katsuhito Sudoh
Proceedings of the 3rd Workshop on Neural Generation and Translation

Findings of the Third Workshop on Neural Generation and Translation
Hiroaki Hayashi | Yusuke Oda | Alexandra Birch | Ioannis Konstas | Andrew Finch | Minh-Thang Luong | Graham Neubig | Katsuhito Sudoh
Proceedings of the 3rd Workshop on Neural Generation and Translation

This document describes the findings of the Third Workshop on Neural Generation and Translation, held in concert with the annual Conference on Empirical Methods in Natural Language Processing (EMNLP 2019). First, we summarize the research trends of papers presented in the proceedings. Second, we describe the results of the two shared tasks: 1) efficient neural machine translation (NMT), where participants were tasked with creating NMT systems that are both accurate and efficient; and 2) document generation and translation (DGT), where participants were tasked with developing systems that generate summaries from structured data, potentially with assistance from text in another language.

2018

Proceedings of the 2nd Workshop on Neural Machine Translation and Generation
Alexandra Birch | Andrew Finch | Thang Luong | Graham Neubig | Yusuke Oda
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation

Findings of the Second Workshop on Neural Machine Translation and Generation
Alexandra Birch | Andrew Finch | Minh-Thang Luong | Graham Neubig | Yusuke Oda
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation

This document describes the findings of the Second Workshop on Neural Machine Translation and Generation, held in concert with the annual conference of the Association for Computational Linguistics (ACL 2018). First, we summarize the research trends of papers presented in the proceedings, and note that there is particular interest in linguistic structure, domain adaptation, data augmentation, handling inadequate resources, and analysis of models. Second, we describe the results of the workshop’s shared task on efficient neural machine translation, where participants were tasked with creating MT systems that are both accurate and efficient.

2017

Sentence Embedding for Neural Machine Translation Domain Adaptation
Rui Wang | Andrew Finch | Masao Utiyama | Eiichiro Sumita
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Although new corpora are becoming increasingly available for machine translation, only those that belong to the same or similar domains are typically able to improve translation performance. Recently, Neural Machine Translation (NMT) has become prominent in the field. However, most existing domain adaptation methods focus only on phrase-based machine translation. In this paper, we exploit NMT’s internal embedding of the source sentence and use sentence embedding similarity to select the sentences that are close to in-domain data. The empirical adaptation results on the IWSLT English-French and NIST Chinese-English tasks show that the proposed methods can substantially improve NMT performance by 2.4-9.0 BLEU points, outperforming the existing state-of-the-art baseline by 2.3-4.5 BLEU points.

Proceedings of the First Workshop on Neural Machine Translation
Thang Luong | Alexandra Birch | Graham Neubig | Andrew Finch
Proceedings of the First Workshop on Neural Machine Translation

2016

Introducing the Asian Language Treebank (ALT)
Ye Kyaw Thu | Win Pa Pa | Masao Utiyama | Andrew Finch | Eiichiro Sumita
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper introduces the ALT project initiated by the Advanced Speech Translation Research and Development Promotion Center (ASTREC), NICT, Kyoto, Japan. The aim of this project is to accelerate NLP research for Asian languages such as Indonesian, Japanese, Khmer, Lao, Malay, Myanmar, Filipino, Thai and Vietnamese. The original resource for this project was English articles that were randomly selected from Wikinews. The project has so far created a corpus for Myanmar and will extend in scope to include other languages in the near future. A 20,000-sentence corpus of Myanmar that has been manually translated from an English corpus has been word segmented, word aligned, part-of-speech tagged and constituency parsed by human annotators. In this paper, we present the implementation steps for creating the treebank in detail, including a description of the ALT web-based treebanking tool. Moreover, we report statistics on the annotation quality of the Myanmar treebank created so far.

Agreement on Target-bidirectional Neural Machine Translation
Lemao Liu | Masao Utiyama | Andrew Finch | Eiichiro Sumita
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Interlocking Phrases in Phrase-based Statistical Machine Translation
Ye Kyaw Thu | Andrew Finch | Eiichiro Sumita
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Neural Machine Translation with Supervised Attention
Lemao Liu | Masao Utiyama | Andrew Finch | Eiichiro Sumita
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

The attention mechanism is appealing for neural machine translation, since it is able to dynamically encode a source sentence by generating an alignment between a target word and source words. Unfortunately, it has been shown to be worse than conventional alignment models in alignment accuracy. In this paper, we analyze and explain this issue from the point of view of reordering, and propose a supervised attention mechanism that is learned with guidance from conventional alignment models. Experiments on two Chinese-to-English translation tasks show that the supervised attention mechanism yields better alignments, leading to substantial gains over standard attention-based NMT.

A Prototype Automatic Simultaneous Interpretation System
Xiaolin Wang | Andrew Finch | Masao Utiyama | Eiichiro Sumita
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

Simultaneous interpretation allows people to communicate spontaneously across language boundaries, but such services are prohibitively expensive for the general public. This paper presents a fully automatic simultaneous interpretation system to address this problem. Though the development is still at an early stage, the system is capable of keeping up with the fastest of the TED speakers while at the same time delivering high-quality translations. We believe that the system will become an effective tool for facilitating cross-lingual communication in the future.

Target-Bidirectional Neural Models for Machine Transliteration
Andrew Finch | Lemao Liu | Xiaolin Wang | Eiichiro Sumita
Proceedings of the Sixth Named Entity Workshop

An Efficient and Effective Online Sentence Segmenter for Simultaneous Interpretation
Xiaolin Wang | Andrew Finch | Masao Utiyama | Eiichiro Sumita
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

Simultaneous interpretation is a very challenging application of machine translation in which the input is a stream of words from a speech recognition engine. The key problem is how to segment the stream in an online manner into units suitable for translation. The segmentation process proceeds by calculating a confidence score for each word that indicates the soundness of placing a sentence boundary after it, and then heuristics are employed to determine the position of the boundaries. Multiple variants of the confidence scoring method and segmentation heuristics were studied. Experimental results show that the best performing strategy is not only efficient in terms of average latency per word, but also achieves end-to-end translation quality close to that of an offline baseline and of oracle segmentation.

2015

Hierarchical Phrase-based Stream Decoding
Andrew Finch | Xiaolin Wang | Masao Utiyama | Eiichiro Sumita
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Leave-one-out Word Alignment without Garbage Collector Effects
Xiaolin Wang | Masao Utiyama | Andrew Finch | Taro Watanabe | Eiichiro Sumita
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Neural Network Transduction Models in Transliteration Generation
Andrew Finch | Lemao Liu | Xiaolin Wang | Eiichiro Sumita
Proceedings of the Fifth Named Entity Workshop

A Large-scale Study of Statistical Machine Translation Methods for Khmer Language
Ye Kyaw Thu | Vichet Chea | Andrew Finch | Masao Utiyama | Eiichiro Sumita
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

2014

Empirical Study of Unsupervised Chinese Word Segmentation Methods for SMT on Large-scale Corpora
Xiaolin Wang | Masao Utiyama | Andrew Finch | Eiichiro Sumita
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Integrating Dictionaries into an Unsupervised Model for Myanmar Word Segmentation
Ye Kyaw Thu | Andrew Finch | Eiichiro Sumita | Yoshinori Sagisaka
Proceedings of the Fifth Workshop on South and Southeast Asian Natural Language Processing

Refining Word Segmentation Using a Manually Aligned Corpus for Statistical Machine Translation
Xiaolin Wang | Masao Utiyama | Andrew Finch | Eiichiro Sumita
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

A Tightly-coupled Unsupervised Clustering and Bilingual Alignment Model for Transliteration
Tingting Li | Tiejun Zhao | Andrew Finch | Chunyue Zhang
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Rescoring a Phrase-based Machine Transliteration System with Recurrent Neural Network Language Models
Andrew Finch | Paul Dixon | Eiichiro Sumita
Proceedings of the 4th Named Entity Workshop (NEWS) 2012

2011

Dialect Translation: Integrating Bayesian Co-segmentation Models with Pivot-based SMT
Michael Paul | Andrew Finch | Paul R. Dixon | Eiichiro Sumita
Proceedings of the First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties

Integrating Models Derived from non-Parametric Bayesian Co-segmentation into a Statistical Machine Transliteration System
Andrew Finch | Paul Dixon | Eiichiro Sumita
Proceedings of the 3rd Named Entities Workshop (NEWS 2011)

Using Features from a Bilingual Alignment Model in Transliteration Mining
Takaaki Fukunishi | Andrew Finch | Seiichi Yamamoto | Eiichiro Sumita
Proceedings of the 3rd Named Entities Workshop (NEWS 2011)

2010

Integration of Multiple Bilingually-Learned Segmentation Schemes into Statistical Machine Translation
Michael Paul | Andrew Finch | Eiichiro Sumita
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Transliteration Using a Phrase-Based Statistical Machine Translation System to Re-Score the Output of a Joint Multigram Model
Andrew Finch | Eiichiro Sumita
Proceedings of the 2010 Named Entities Workshop

Syntactic Constraints on Phrase Extraction for Phrase-Based Machine Translation
Hailong Cao | Andrew Finch | Eiichiro Sumita
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation

2009

Bidirectional Phrase-based Statistical Machine Translation
Andrew Finch | Eiichiro Sumita
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

NICT@WMT09: Model Adaptation and Transliteration for Spanish-English SMT
Michael Paul | Andrew Finch | Eiichiro Sumita
Proceedings of the Fourth Workshop on Statistical Machine Translation

Transliteration by Bidirectional Statistical Machine Translation
Andrew Finch | Eiichiro Sumita
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

2008

Phrase-based Machine Transliteration
Andrew Finch | Eiichiro Sumita
Proceedings of the Workshop on Technologies and Corpora for Asia-Pacific Speech Translation (TCAST)

Dynamic Model Interpolation for Statistical Machine Translation
Andrew Finch | Eiichiro Sumita
Proceedings of the Third Workshop on Statistical Machine Translation

2006

Using Lexical Dependency and Ontological Knowledge to Improve a Detailed Syntactic and Semantic Tagger of English
Andrew Finch | Ezra Black | Young-Sook Hwang | Eiichiro Sumita
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2005

Using Machine Translation Evaluation Techniques to Determine Sentence-level Semantic Equivalence
Andrew Finch | Young-Sook Hwang | Eiichiro Sumita
Proceedings of the Third International Workshop on Paraphrasing (IWP2005)

2004

How Does Automatic Machine Translation Evaluation Correlate with Human Scoring as the Number of Reference Translations Increases?
Andrew Finch | Yasuhiro Akiba | Eiichiro Sumita
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

A corpus-centered approach to spoken language translation
Eiichiro Sumita | Yasuhiro Akiba | Takao Doi | Andrew Finch | Kenji Imamura | Michael Paul | Mitsuo Shimohata | Taro Watanabe
10th Conference of the European Chapter of the Association for Computational Linguistics

2002

Beyond Tag Trigrams: New Local Features for Tagging
Andrew Finch | Ezra Black | Ringo Wathelet
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

1999

Applying Extrasentential Context To Maximum Entropy Based Tagging With A Large Semantic And Syntactic Tagset
Ezra Black | Andrew Finch | Ruiqiang Zhang
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

1998

Trigger-Pair Predictors in Parsing and Tagging
Ezra Black | Andrew Finch | Hideki Kashioka
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

Use of Mutual Information Based Character Clusters in Dictionary-less Morphological Analysis of Japanese
Hideki Kashioka | Yasuhiro Kawata | Yumiko Kinjo | Andrew Finch | Ezra W. Black
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

Trigger-Pair Predictors in Parsing and Tagging
Ezra Black | Andrew Finch | Hideki Kashioka
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Use of Mutual Information Based Character Clusters in Dictionary-less Morphological Analysis of Japanese
Hideki Kashioka | Yasuhiro Kawata | Yumiko Kinjo | Andrew Finch | Ezra W. Black
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1