Hideya Mino


2020

pdf bib
Content-Equivalent Translated Parallel News Corpus and Extension of Domain Adaptation for NMT
Hideya Mino | Hideki Tanaka | Hitoshi Ito | Isao Goto | Ichiro Yamada | Takenobu Tokunaga
Proceedings of the 12th Language Resources and Evaluation Conference

In this paper, we deal with two problems in Japanese-English machine translation of news articles. The first problem is the quality of parallel corpora. Neural machine translation (NMT) systems suffer degraded performance when trained with noisy data. Because there is no clean Japanese-English parallel data for news articles, we build a novel parallel news corpus consisting of Japanese news articles translated into English in a content-equivalent manner. This is the first content-equivalent Japanese-English news corpus translated specifically for training NMT systems. The second problem involves the domain-adaptation technique. NMT systems suffer degraded performance when trained with mixed data having different features, such as noisy data and clean data. Though the existing methods try to overcome this problem by using tags for distinguishing the differences between corpora, it is not sufficient. We thus extend a domain-adaptation method using multi-tags to train an NMT model effectively with the clean corpus and existing parallel news corpora with some types of noise. Experimental results show that our corpus increases the translation quality, and that our domain-adaptation method is more effective for learning with the multiple types of corpora than existing domain-adaptation methods are.

pdf bib
Effective Use of Target-side Context for Neural Machine Translation
Hideya Mino | Hitoshi Ito | Isao Goto | Ichiro Yamada | Takenobu Tokunaga
Proceedings of the 28th International Conference on Computational Linguistics

In this paper, we deal with two problems in Japanese-English machine translation of news articles. The first problem is the quality of parallel corpora. Neural machine translation (NMT) systems suffer degraded performance when trained with noisy data. Because there is no clean Japanese-English parallel data for news articles, we build a novel parallel news corpus consisting of Japanese news articles translated into English in a content-equivalent manner. This is the first content-equivalent Japanese-English news corpus translated specifically for training NMT systems. The second problem involves the domain-adaptation technique. NMT systems suffer degraded performance when trained with mixed data having different features, such as noisy data and clean data. Though the existing methods try to overcome this problem by using tags for distinguishing the differences between corpora, it is not sufficient. We thus extend a domain-adaptation method using multi-tags to train an NMT model effectively with the clean corpus and existing parallel news corpora with some types of noise. Experimental results show that our corpus increases the translation quality, and that our domain-adaptation method is more effective for learning with the multiple types of corpora than existing domain-adaptation methods are.

2019

pdf bib
Overview of the 6th Workshop on Asian Translation
Toshiaki Nakazawa | Nobushige Doi | Shohei Higashiyama | Chenchen Ding | Raj Dabre | Hideya Mino | Isao Goto | Win Pa Pa | Anoop Kunchukuttan | Yusuke Oda | Shantipriya Parida | Ondřej Bojar | Sadao Kurohashi
Proceedings of the 6th Workshop on Asian Translation

This paper presents the results of the shared tasks from the 6th workshop on Asian translation (WAT2019) including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and Ru↔Ja news commentary translation task. For the WAT2019, 25 teams participated in the shared tasks. We also received 10 research paper submissions out of which 61 were accepted. About 400 translation results were submitted to the automatic evaluation server, and selected submis- sions were manually evaluated.

pdf bib
Neural Machine Translation System using a Content-equivalently Translated Parallel Corpus for the Newswire Translation Tasks at WAT 2019
Hideya Mino | Hitoshi Ito | Isao Goto | Ichiro Yamada | Hideki Tanaka | Takenobu Tokunaga
Proceedings of the 6th Workshop on Asian Translation

This paper describes NHK and NHK Engineering System (NHK-ES)’s submission to the newswire translation tasks of WAT 2019 in both directions of Japanese→English and English→Japanese. In addition to the JIJI Corpus that was officially provided by the task organizer, we developed a corpus of 0.22M sentence pairs by manually, translating Japanese news sentences into English content- equivalently. The content-equivalent corpus was effective for improving translation quality, and our systems achieved the best human evaluation scores in the newswire translation tasks at WAT 2019.

2018

pdf bib
Overview of the 5th Workshop on Asian Translation
Toshiaki Nakazawa | Katsuhito Sudoh | Shohei Higashiyama | Chenchen Ding | Raj Dabre | Hideya Mino | Isao Goto | Win Pa Pa | Anoop Kunchukuttan | Sadao Kurohashi
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation: 5th Workshop on Asian Translation

2017

pdf bib
Key-value Attention Mechanism for Neural Machine Translation
Hideya Mino | Masao Utiyama | Eiichiro Sumita | Takenobu Tokunaga
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

In this paper, we propose a neural machine translation (NMT) with a key-value attention mechanism on the source-side encoder. The key-value attention mechanism separates the source-side content vector into two types of memory known as the key and the value. The key is used for calculating the attention distribution, and the value is used for encoding the context representation. Experiments on three different tasks indicate that our model outperforms an NMT model with a conventional attention mechanism. Furthermore, we perform experiments with a conventional NMT framework, in which a part of the initial value of a weight matrix is set to zero so that the matrix is as the same initial-state as the key-value attention mechanism. As a result, we obtain comparable results with the key-value attention mechanism without changing the network structure.

pdf bib
Overview of the 4th Workshop on Asian Translation
Toshiaki Nakazawa | Shohei Higashiyama | Chenchen Ding | Hideya Mino | Isao Goto | Hideto Kazawa | Yusuke Oda | Graham Neubig | Sadao Kurohashi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

This paper presents the results of the shared tasks from the 4th workshop on Asian translation (WAT2017) including J↔E, J↔C scientific paper translation subtasks, C↔J, K↔J, E↔J patent translation subtasks, H↔E mixed domain subtasks, J↔E newswire subtasks and J↔E recipe subtasks. For the WAT2017, 12 institutions participated in the shared tasks. About 300 translation results have been submitted to the automatic evaluation server, and selected submissions were manually evaluated.

2016

pdf bib
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)
Toshiaki Nakazawa | Hideya Mino | Chenchen Ding | Isao Goto | Graham Neubig | Sadao Kurohashi | Ir. Hammam Riza | Pushpak Bhattacharyya
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

pdf bib
Overview of the 3rd Workshop on Asian Translation
Toshiaki Nakazawa | Chenchen Ding | Hideya Mino | Isao Goto | Graham Neubig | Sadao Kurohashi
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

This paper presents the results of the shared tasks from the 3rd workshop on Asian translation (WAT2016) including J ↔ E, J ↔ C scientific paper translation subtasks, C ↔ J, K ↔ J, E ↔ J patent translation subtasks, I ↔ E newswire subtasks and H ↔ E, H ↔ J mixed domain subtasks. For the WAT2016, 15 institutions participated in the shared tasks. About 500 translation results have been submitted to the automatic evaluation server, and selected submissions were manually evaluated.

2015

bib
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)
Toshiaki Nakazawa | Hideya Mino | Isao Goto | Graham Neubig | Sadao Kurohashi | Eiichiro Sumita
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)

pdf bib
Overview of the 2nd Workshop on Asian Translation
Toshiaki Nakazawa | Hideya Mino | Isao Goto | Graham Neubig | Sadao Kurohashi | Eiichiro Sumita
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)

2014

pdf bib
Proceedings of the 1st Workshop on Asian Translation (WAT2014)
Toshiaki Nakazawa | Hideya Mino | Isao Goto | Sadao Kurohashi | Eiichiro Sumita
Proceedings of the 1st Workshop on Asian Translation (WAT2014)

pdf bib
Overview of the 1st Workshop on Asian Translation
Toshiaki Nakazawa | Hideya Mino | Isao Goto | Sadao Kurohashi | Eiichiro Sumita
Proceedings of the 1st Workshop on Asian Translation (WAT2014)

pdf bib
Syntax-Augmented Machine Translation using Syntax-Label Clustering
Hideya Mino | Taro Watanabe | Eiichiro Sumita
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)