Kenji Imamura


2019

Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019
Kenji Imamura | Eiichiro Sumita
Proceedings of the 6th Workshop on Asian Translation

This paper describes the NICT-2 neural machine translation system at the 6th Workshop on Asian Translation. This system employs the standard Transformer model but has two distinguishing characteristics. One is the long warm-up strategy, which warms up the learning rate for longer at the start of training than conventional approaches do. The other is the introduction of self-training approaches based on multiple back-translations generated by sampling. We participated in three tasks (ASPEC.en-ja, ASPEC.ja-en, and TDDC.ja-en) using this system.
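
As a rough illustration of what a longer learning-rate warm-up means, the sketch below uses the inverse-square-root schedule from the original Transformer paper. It is not taken from the NICT-2 system: the function name and the values of `d_model` and `warmup_steps` are assumptions chosen only to contrast a shorter and a longer warm-up.

```python
def inverse_sqrt_lr(step: int, d_model: int = 512, warmup_steps: int = 4000) -> float:
    """Learning rate under the inverse-square-root schedule with linear warm-up.

    The rate grows linearly until `warmup_steps`, then decays as step ** -0.5.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# Hypothetical settings: a conventional warm-up versus a four times longer one.
short_warmup = 4000
long_warmup = 16000

for step in (1000, 4000, 16000, 64000):
    lr_short = inverse_sqrt_lr(step, warmup_steps=short_warmup)
    lr_long = inverse_sqrt_lr(step, warmup_steps=long_warmup)
    print(f"step={step:6d}  short={lr_short:.6f}  long={lr_long:.6f}")
```

Under this schedule, a longer warm-up both delays the peak and lowers it, so early updates are smaller; once both schedules are past their warm-up, they coincide.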

Recycling a Pre-trained BERT Encoder for Neural Machine Translation
Kenji Imamura | Eiichiro Sumita
Proceedings of the 3rd Workshop on Neural Generation and Translation

In this paper, a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model is applied to Transformer-based neural machine translation (NMT). In contrast to monolingual tasks, the number of unlearned model parameters in an NMT decoder is as large as the number of learned parameters in the BERT model. To train all the models appropriately, we employ two-stage optimization, which first trains only the unlearned parameters by freezing the BERT model, and then fine-tunes all the sub-models. In our experiments, stable two-stage optimization was achieved, whereas the BLEU scores of direct fine-tuning were extremely low. Consequently, the BLEU scores of the proposed method were better than those of the Transformer base model and the same model without pre-training. Additionally, we confirmed that NMT with the BERT encoder is more effective in low-resource settings.

Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation
Aizhan Imankulova | Raj Dabre | Atsushi Fujita | Kenji Imamura
Proceedings of Machine Translation Summit XVII Volume 1: Research Track

2018

Multilingual Parallel Corpus for Global Communication Plan
Kenji Imamura | Eiichiro Sumita
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation
Kenji Imamura | Atsushi Fujita | Eiichiro Sumita
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation

A large-scale parallel corpus is required to train encoder-decoder neural machine translation. The method of using synthetic parallel texts, in which target monolingual corpora are automatically translated into source sentences, is effective in improving the decoder, but is unreliable for enhancing the encoder. In this paper, we propose a method that enhances the encoder and attention using target monolingual corpora by generating multiple source sentences via sampling. By using multiple source sentences, diversity close to that of humans is achieved. Our experimental results show that the translation quality is improved by increasing the number of synthetic source sentences for each given target sentence, and quality close to that using a manually created parallel corpus was achieved.

NICT Self-Training Approach to Neural Machine Translation at NMT-2018
Kenji Imamura | Eiichiro Sumita
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation

This paper describes the NICT neural machine translation system submitted to the NMT-2018 shared task. A characteristic of our approach is the introduction of self-training. Since our self-training does not change the model structure, it does not influence the efficiency of translation, such as the translation speed. The experimental results showed that the translation quality improved not only in the sequence-to-sequence (seq-to-seq) models but also in the Transformer models.

2017

Ensemble and Reranking: Using Multiple Models in the NICT-2 Neural Machine Translation System at WAT2017
Kenji Imamura | Eiichiro Sumita
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

In this paper, we describe the NICT-2 neural machine translation system evaluated at WAT2017. This system uses multiple models as an ensemble and combines models with opposite decoding directions by reranking (called bi-directional reranking). In our experimental results on small data sets, the translation quality kept improving without saturating as the number of models was increased to 32 in total. In the experiments on large data sets, improvements of 1.59-3.32 BLEU points were achieved when six-model ensembles were combined by bi-directional reranking.
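
The idea of reranking with models of opposite decoding directions can be sketched as follows. This is only a minimal illustration under stated assumptions, not the system's actual procedure: the abstract does not say how the two scores are combined, so the equal-weight sum of log-probabilities, the function name `bidirectional_rerank`, and the toy scores are all hypothetical.

```python
from typing import Dict, List


def bidirectional_rerank(
    nbest: List[str],
    l2r_logprob: Dict[str, float],
    r2l_logprob: Dict[str, float],
) -> str:
    """Return the hypothesis with the highest summed log-probability.

    Assumption: the left-to-right and right-to-left scores are weighted equally.
    """
    if not nbest:
        raise ValueError("nbest must not be empty")
    return max(nbest, key=lambda hyp: l2r_logprob[hyp] + r2l_logprob[hyp])


# Toy n-best list with made-up log-probabilities from each decoding direction.
nbest = ["the cat sat", "a cat sat", "the cat sits"]
l2r = {"the cat sat": -1.2, "a cat sat": -1.0, "the cat sits": -2.5}
r2l = {"the cat sat": -0.9, "a cat sat": -1.8, "the cat sits": -2.0}

best = bidirectional_rerank(nbest, l2r, r2l)
print(best)
```

In this toy case the left-to-right model alone would prefer "a cat sat", but the combined score selects a hypothesis that both directions rate reasonably well.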

2016

NICT-2 Translation System for WAT2016: Applying Domain Adaptation to Phrase-based Statistical Machine Translation
Kenji Imamura | Eiichiro Sumita
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

This paper describes the NICT-2 translation system for the 3rd Workshop on Asian Translation. The proposed system employs a domain adaptation method based on feature augmentation. We regarded the Japan Patent Office Corpus as a mixture of four domain corpora and improved the translation quality of each domain. In addition, we incorporated language models constructed from Google n-grams as external knowledge. Our domain adaptation method can naturally incorporate such external knowledge that contributes to translation quality.

2014

Predicate-Argument Structure Analysis with Zero-Anaphora Resolution for Dialogue Systems
Kenji Imamura | Ryuichiro Higashinaka | Tomoko Izumi
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Towards an open-domain conversational system fully based on natural language processing
Ryuichiro Higashinaka | Kenji Imamura | Toyomi Meguro | Chiaki Miyazaki | Nozomi Kobayashi | Hiroaki Sugiyama | Toru Hirano | Toshiro Makino | Yoshihiro Matsuo
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Case Study of Model Adaptation: Transfer Learning and Online Learning
Kenji Imamura
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation
Kenji Imamura | Kuniko Saito | Kugatsu Sadamitsu | Hitoshi Nishikawa
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Constructing a Class-Based Lexical Dictionary using Interactive Topic Models
Kugatsu Sadamitsu | Kuniko Saito | Kenji Imamura | Yoshihiro Matsuo
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper proposes a new method of constructing arbitrary class-based related word dictionaries based on interactive topic models; we assume that each class is described by a topic. We propose a new semi-supervised method that uses the simplest topic model yielded by the standard EM algorithm; model calculation is very rapid. Furthermore, our approach allows a dictionary to be modified interactively, and the final dictionary has a hierarchical structure. This paper makes three contributions. First, we propose a word-based semi-supervised topic model. Second, we apply the semi-supervised topic model to interactive learning; this approach is called the Interactive Topic Model. Third, we propose a score function that extracts the related words occupying the middle layer of the hierarchical structure. Experiments show that our method can appropriately retrieve the words belonging to an arbitrary class.

Entity Set Expansion using Interactive Topic Information
Kugatsu Sadamitsu | Kuniko Saito | Kenji Imamura | Yoshihiro Matsuo
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

2011

Entity Set Expansion using Topic information
Kugatsu Sadamitsu | Kuniko Saito | Kenji Imamura | Genichiro Kikui
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Standardizing Complex Functional Expressions in Japanese Predicates: Applying Theoretically-Based Paraphrasing Rules
Tomoko Izumi | Kenji Imamura | Genichiro Kikui | Satoshi Sato
Proceedings of the 2010 Workshop on Multiword Expressions: from Theory to Applications

2009

Tag Confidence Measure for Semi-Automatically Updating Named Entity Recognition
Kuniko Saito | Kenji Imamura
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

Discriminative Approach to Predicate-Argument Structure Analysis with Zero-Anaphora Resolution
Kenji Imamura | Kuniko Saito | Tomoko Izumi
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2007

Japanese Dependency Parsing Using Sequential Labeling for Semi-spoken Language
Kenji Imamura | Genichiro Kikui | Norihito Yasuda
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions

2004

Example-based Machine Translation Based on Syntactic Transfer with Statistical Models
Kenji Imamura | Hideo Okuma | Taro Watanabe | Eiichiro Sumita
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

Automatic Construction of Machine Translation Knowledge Using Translation Literalness
Kenji Imamura | Eiichiro Sumita | Yuji Matsumoto
10th Conference of the European Chapter of the Association for Computational Linguistics

A corpus-centered approach to spoken language translation
Eiichiro Sumita | Yasuhiro Akiba | Takao Doi | Andrew Finch | Kenji Imamura | Michael Paul | Mitsuo Shimohata | Taro Watanabe
10th Conference of the European Chapter of the Association for Computational Linguistics

Feedback Cleaning of Machine Translation Rules Using Automatic Evaluation
Kenji Imamura | Eiichiro Sumita | Yuji Matsumoto
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

Automatic Expansion of Equivalent Sentence Set Based on Syntactic Substitution
Kenji Imamura | Yasuhiro Akiba | Eiichiro Sumita
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers

2002

Comparing and Extracting Paraphrasing Words with 2-Way Bilingual Dictionaries
Kazutaka Takao | Kenji Imamura | Hideki Kashioka
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2000

Taking Account of the User’s View in 3D Multimodal Instruction Dialogue
Yukiko I. Nakano | Kenji Imamura | Hisashi Ohara
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics