Jun Suzuki


2020

pdf bib
Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese
Tatsuki Kuribayashi | Takumi Ito | Jun Suzuki | Kentaro Inui
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We examine a methodology using neural language models (LMs) for analyzing the word order of language. This LM-based method has the potential to overcome the difficulties existing methods face, such as the propagation of preprocessor errors in count-based methods. In this study, we explore whether the LM-based method is valid for analyzing the word order. As a case study, this study focuses on Japanese due to its complex and flexible word order. To validate the LM-based method, we test (i) parallels between LMs and human word order preference, and (ii) consistency of the results obtained using the LM-based method with previous linguistic studies. Through our experiments, we tentatively conclude that LMs display sufficient word order knowledge for usage as an analysis tool. Finally, using the LM-based method, we demonstrate the relationship between the canonical word order and topicalization, which had yet to be analyzed by large-scale experiments.

pdf bib
Evaluating Dialogue Generation Systems via Response Selection
Shiki Sato | Reina Akama | Hiroki Ouchi | Jun Suzuki | Kentaro Inui
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Existing automatic evaluation metrics for open-domain dialogue response generation systems correlate poorly with human evaluation. We focus on evaluating response generation systems via response selection. To evaluate systems properly via response selection, we propose a method to construct response selection test sets with well-chosen false candidates. Specifically, we propose to construct test sets filtering out some types of false candidates: (i) those unrelated to the ground-truth response and (ii) those acceptable as appropriate responses. Through experiments, we demonstrate that evaluating systems via response selection with the test set developed by our method correlates more strongly with human evaluation, compared with widely used automatic evaluation metrics such as BLEU.

pdf bib
Single Model Ensemble using Pseudo-Tags and Distinct Vectors
Ryosuke Kuwabara | Jun Suzuki | Hideki Nakayama
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Model ensemble techniques often increase task performance in neural networks; however, they require increased time, memory, and management effort. In this study, we propose a novel method that replicates the effects of a model ensemble with a single model. Our approach creates K-virtual models within a single parameter space using K-distinct pseudo-tags and K-distinct vectors. Experiments on text classification and sequence labeling tasks on several datasets demonstrate that our method emulates or outperforms a traditional model ensemble with 1/K-times fewer parameters.

pdf bib
Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction
Masahiro Kaneko | Masato Mita | Shun Kiyono | Jun Suzuki | Kentaro Inui
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper investigates how to effectively incorporate a pre-trained masked language model (MLM), such as BERT, into an encoder-decoder (EncDec) model for grammatical error correction (GEC). The answer to this question is not as straightforward as one might expect because the previous common methods for incorporating a MLM into an EncDec model have potential drawbacks when applied to GEC. For example, the distribution of the inputs to a GEC model can be considerably different (erroneous, clumsy, etc.) from that of the corpora used for pre-training MLMs; however, this issue is not addressed in the previous methods. Our experiments show that our proposed method, where we first fine-tune a MLM with a given GEC corpus and then use the output of the fine-tuned MLM as additional features in the GEC model, maximizes the benefit of the MLM. The best-performing model achieves state-of-the-art performances on the BEA-2019 and CoNLL-2014 benchmarks. Our code is publicly available at: https://github.com/kanekomasahiro/bert-gec.

pdf bib
Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition
Hiroki Ouchi | Jun Suzuki | Sosuke Kobayashi | Sho Yokoi | Tatsuki Kuribayashi | Ryuto Konno | Kentaro Inui
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Interpretable rationales for model predictions play a critical role in practical applications. In this study, we develop models possessing interpretable inference process for structured prediction. Specifically, we present a method of instance-based learning that learns similarities between spans. At inference time, each span is assigned a class label based on its similar spans in the training set, where it is easy to understand how much each training instance contributes to the predictions. Through empirical analysis on named entity recognition, we demonstrate that our method enables to build models that have high interpretability without sacrificing performance.

pdf bib
Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition
Takuma Kato | Kaori Abe | Hiroki Ouchi | Shumpei Miyawaki | Jun Suzuki | Kentaro Inui
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

In general, the labels used in sequence labeling consist of different types of elements. For example, IOB-format entity labels, such as B-Person and I-Person, can be decomposed into span (B and I) and type information (Person). However, while most sequence labeling models do not consider such label components, the shared components across labels, such as Person, can be beneficial for label prediction. In this work, we propose to integrate label component information as embeddings into models. Through experiments on English and Japanese fine-grained named entity recognition, we demonstrate that the proposed method improves performance, especially for instances with low-frequency labels.

pdf bib
Preventing Critical Scoring Errors in Short Answer Scoring with Confidence Estimation
Hiroaki Funayama | Shota Sasaki | Yuichiroh Matsubayashi | Tomoya Mizumoto | Jun Suzuki | Masato Mita | Kentaro Inui
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Many recent Short Answer Scoring (SAS) systems have employed Quadratic Weighted Kappa (QWK) as the evaluation measure of their systems. However, we hypothesize that QWK is unsatisfactory for the evaluation of the SAS systems when we consider measuring their effectiveness in actual usage. We introduce a new task formulation of SAS that matches the actual usage. In our formulation, the SAS systems should extract as many scoring predictions that are not critical scoring errors (CSEs). We conduct the experiments in our new task formulation and demonstrate that a typical SAS system can predict scores with zero CSE for approximately 50% of test data at maximum by filtering out low-reliablility predictions on the basis of a certain confidence estimation. This result directly indicates the possibility of reducing half the scoring cost of human raters, which is more preferable for the evaluation of SAS systems.

pdf bib
JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus
Makoto Morishita | Jun Suzuki | Masaaki Nagata
Proceedings of the 12th Language Resources and Evaluation Conference

Recent machine translation algorithms mainly rely on parallel corpora. However, since the availability of parallel corpora remains limited, only some resource-rich language pairs can benefit from them. We constructed a parallel corpus for English-Japanese, for which the amount of publicly available parallel corpora is still limited. We constructed the parallel corpus by broadly crawling the web and automatically aligning parallel sentences. Our collected corpus, called JParaCrawl, amassed over 8.7 million sentence pairs. We show how it includes a broader range of domains and how a neural machine translation model trained with it works as a good pre-trained model for fine-tuning specific domains. The pre-training and fine-tuning approaches achieved or surpassed performance comparable to model training from the initial state and reduced the training time. Additionally, we trained the model with an in-domain dataset and JParaCrawl to show how we achieved the best performance with them. JParaCrawl and the pre-trained models are freely available online for research purposes.

pdf bib
Seeing the World through Text: Evaluating Image Descriptions for Commonsense Reasoning in Machine Reading Comprehension
Diana Galvan-Sosa | Jun Suzuki | Kyosuke Nishida | Koji Matsuda | Kentaro Inui
Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

Despite recent achievements in natural language understanding, reasoning over commonsense knowledge still represents a big challenge to AI systems. As the name suggests, common sense is related to perception and as such, humans derive it from experience rather than from literary education. Recent works in the NLP and the computer vision field have made the effort of making such knowledge explicit using written language and visual inputs, respectively. Our premise is that the latter source fits better with the characteristics of commonsense acquisition. In this work, we explore to what extent the descriptions of real-world scenes are sufficient to learn common sense about different daily situations, drawing upon visual information to answer script knowledge questions.

pdf bib
A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction
Masato Mita | Shun Kiyono | Masahiro Kaneko | Jun Suzuki | Kentaro Inui
Findings of the Association for Computational Linguistics: EMNLP 2020

Existing approaches for grammatical error correction (GEC) largely rely on supervised learning with manually created GEC datasets. However, there has been little focus on verifying and ensuring the quality of the datasets, and on how lower-quality data might affect GEC performance. We indeed found that there is a non-negligible amount of “noise” where errors were inappropriately edited or left uncorrected. To address this, we designed a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models, and outperformed strong denoising baseline methods. We further applied task-specific techniques and achieved state-of-the-art performance on the CoNLL-2014, JFLEG, and BEA-2019 benchmarks. We then analyzed the effect of the proposed denoising method, and found that our approach leads to improved coverage of corrections and facilitated fluency edits which are reflected in higher recall and overall performance.

pdf bib
Efficient Estimation of Influence of a Training Instance
Sosuke Kobayashi | Sho Yokoi | Jun Suzuki | Kentaro Inui
Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing

Understanding the influence of a training instance on a neural network model leads to improving interpretability. However, it is difficult and inefficient to evaluate the influence, which shows how a model’s prediction would be changed if a training instance were not used. In this paper, we propose an efficient method for estimating the influence. Our method is inspired by dropout, which zero-masks a sub-network and prevents the sub-network from learning each training instance. By switching between dropout masks, we can use sub-networks that learned or did not learn each training instance and estimate its influence. Through experiments with BERT and VGGNet on classification datasets, we demonstrate that the proposed method can capture training influences, enhance the interpretability of error predictions, and cleanse the training dataset for improving generalization.

pdf bib
PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents
Ryo Fujii | Masato Mita | Kaori Abe | Kazuaki Hanawa | Makoto Morishita | Jun Suzuki | Kentaro Inui
Proceedings of the 28th International Conference on Computational Linguistics

Neural Machine Translation (NMT) has shown drastic improvement in its quality when translating clean input, such as text from the news domain. However, existing studies suggest that NMT still struggles with certain kinds of input with considerable noise, such as User-Generated Contents (UGC) on the Internet. To make better use of NMT for cross-cultural communication, one of the most promising directions is to develop a model that correctly handles these expressions. Though its importance has been recognized, it is still not clear as to what creates the great gap in performance between the translation of clean input and that of UGC. To answer the question, we present a new dataset, PheMT, for evaluating the robustness of MT systems against specific linguistic phenomena in Japanese-English translation. Our experiments with the created dataset revealed that not only our in-house models but even widely used off-the-shelf systems are greatly disturbed by the presence of certain phenomena.

pdf bib
Filtering Noisy Dialogue Corpora by Connectivity and Content Relatedness
Reina Akama | Sho Yokoi | Jun Suzuki | Kentaro Inui
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Large-scale dialogue datasets have recently become available for training neural dialogue agents. However, these datasets have been reported to contain a non-negligible number of unacceptable utterance pairs. In this paper, we propose a method for scoring the quality of utterance pairs in terms of their connectivity and relatedness. The proposed scoring method is designed based on findings widely shared in the dialogue and linguistics research communities. We demonstrate that it has a relatively good correlation with the human judgment of dialogue quality. Furthermore, the method is applied to filter out potentially unacceptable utterance pairs from a large-scale noisy dialogue corpus to ensure its quality. We experimentally confirm that training data filtered by the proposed method improves the quality of neural dialogue agents in response generation.

pdf bib
Word Rotator’s Distance
Sho Yokoi | Ryo Takahashi | Reina Akama | Jun Suzuki | Kentaro Inui
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

One key principle for assessing textual similarity is measuring the degree of semantic overlap between texts by considering the word alignment. Such alignment-based approaches are both intuitive and interpretable; however, they are empirically inferior to the simple cosine similarity between general-purpose sentence vectors. We focus on the fact that the norm of word vectors is a good proxy for word importance, and the angle of them is a good proxy for word similarity. However, alignment-based approaches do not distinguish the norm and direction, whereas sentence-vector approaches automatically use the norm as the word importance. Accordingly, we propose decoupling word vectors into their norm and direction then computing the alignment-based similarity with the help of earth mover’s distance (optimal transport), which we refer to as word rotator’s distance. Furthermore, we demonstrate how to grow the norm and direction of word vectors (vector converter); this is a new systematic approach derived from the sentence-vector estimation methods, which can significantly improve the performance of the proposed method. On several STS benchmarks, the proposed methods outperform not only alignment-based approaches but also strong baselines. The source code is avaliable at https://github.com/eumesy/wrd

pdf bib
Langsmith: An Interactive Academic Text Revision System
Takumi Ito | Tatsuki Kuribayashi | Masatoshi Hidaka | Jun Suzuki | Kentaro Inui
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Despite the current diversity and inclusion initiatives in the academic community, researchers with a non-native command of English still face significant obstacles when writing papers in English. This paper presents the Langsmith editor, which assists inexperienced, non-native researchers to write English papers, especially in the natural language processing (NLP) field. Our system can suggest fluent, academic-style sentences to writers based on their rough, incomplete phrases or sentences. The system also encourages interaction between human writers and the computerized revision system. The experimental results demonstrated that Langsmith helps non-native English-speaker students write papers in English. The system is available at https://emnlp-demo.editor. langsmith.co.jp/.

2019

pdf bib
Select and Attend: Towards Controllable Content Selection in Text Generation
Xiaoyu Shen | Jun Suzuki | Kentaro Inui | Hui Su | Dietrich Klakow | Satoshi Sekine
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Many text generation tasks naturally contain two steps: content selection and surface realization. Current neural encoder-decoder models conflate both steps into a black-box architecture. As a result, the content to be described in the text cannot be explicitly controlled. This paper tackles this problem by decoupling content selection from the decoder. The decoupled content selection is human interpretable, whose value can be manually manipulated to control the content of generated text. The model can be trained end-to-end without human annotations by maximizing a lower bound of the marginal likelihood. We further propose an effective way to trade-off between performance and controllability with a single adjustable hyperparameter. In both data-to-text and headline generation tasks, our model achieves promising results, paving the way for controllable content selection in text generation.

pdf bib
An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction
Shun Kiyono | Jun Suzuki | Masato Mita | Tomoya Mizumoto | Kentaro Inui
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The incorporation of pseudo data in the training of grammatical error correction models has been one of the main factors in improving the performance of such models. However, consensus is lacking on experimental configurations, namely, choosing how the pseudo data should be generated or used. In this study, these choices are investigated through extensive experiments, and state-of-the-art performance is achieved on the CoNLL-2014 test set (F0.5=65.0) and the official test set of the BEA-2019 shared task (F0.5=70.2) without making any modifications to the model architecture.

pdf bib
Transductive Learning of Neural Language Models for Syntactic and Semantic Analysis
Hiroki Ouchi | Jun Suzuki | Kentaro Inui
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In transductive learning, an unlabeled test set is used for model training. Although this setting deviates from the common assumption of a completely unseen test set, it is applicable in many real-world scenarios, wherein the texts to be processed are known in advance. However, despite its practical advantages, transductive learning is underexplored in natural language processing. Here we conduct an empirical study of transductive learning for neural models and demonstrate its utility in syntactic and semantic tasks. Specifically, we fine-tune language models (LMs) on an unlabeled test set to obtain test-set-specific word representations. Through extensive experiments, we demonstrate that despite its simplicity, transductive LM fine-tuning consistently improves state-of-the-art neural models in in-domain and out-of-domain settings.

pdf bib
TEASPN: Framework and Protocol for Integrated Writing Assistance Environments
Masato Hagiwara | Takumi Ito | Tatsuki Kuribayashi | Jun Suzuki | Kentaro Inui
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

Language technologies play a key role in assisting people with their writing. Although there has been steady progress in e.g., grammatical error correction (GEC), human writers are yet to benefit from this progress due to the high development cost of integrating with writing software. We propose TEASPN, a protocol and an open-source framework for achieving integrated writing assistance environments. The protocol standardizes the way writing software communicates with servers that implement such technologies, allowing developers and researchers to integrate the latest developments in natural language processing (NLP) with low cost. As a result, users can enjoy the integrated experience in their favorite writing software. The results from experiments with human participants show that users use a wide range of technologies and rate their writing experience favorably, allowing them to write more fluent text.

pdf bib
NTT Neural Machine Translation Systems at WAT 2019
Makoto Morishita | Jun Suzuki | Masaaki Nagata
Proceedings of the 6th Workshop on Asian Translation

In this paper, we describe our systems that were submitted to the translation shared tasks at WAT 2019. This year, we participated in two distinct types of subtasks, a scientific paper subtask and a timely disclosure subtask, where we only considered English-to-Japanese and Japanese-to-English translation directions. We submitted two systems (En-Ja and Ja-En) for the scientific paper subtask and two systems (Ja-En, texts, items) for the timely disclosure subtask. Three of our four systems obtained the best human evaluation performances. We also confirmed that our new additional web-crawled parallel corpus improves the performance in unconstrained settings.

pdf bib
Effective Adversarial Regularization for Neural Machine Translation
Motoki Sato | Jun Suzuki | Shun Kiyono
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

A regularization technique based on adversarial perturbation, which was initially developed in the field of image processing, has been successfully applied to text classification tasks and has yielded attractive improvements. We aim to further leverage this promising methodology into more sophisticated and critical neural models in the natural language processing field, i.e., neural machine translation (NMT) models. However, it is not trivial to apply this methodology to such models. Thus, this paper investigates the effectiveness of several possible configurations of applying the adversarial perturbation and reveals that the adversarial regularization technique can significantly and consistently improve the performance of widely used NMT models, such as LSTM-based and Transformer-based models.

pdf bib
An Empirical Study of Span Representations in Argumentation Structure Parsing
Tatsuki Kuribayashi | Hiroki Ouchi | Naoya Inoue | Paul Reisert | Toshinori Miyoshi | Jun Suzuki | Kentaro Inui
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

For several natural language processing (NLP) tasks, span representation design is attracting considerable attention as a promising new technique; a common basis for an effective design has been established. With such basis, exploring task-dependent extensions for argumentation structure parsing (ASP) becomes an interesting research direction. This study investigates (i) span representation originally developed for other NLP tasks and (ii) a simple task-dependent extension for ASP. Our extensive experiments and analysis show that these representations yield high performance for ASP and provide some challenging types of instances to be parsed.

pdf bib
Annotating with Pros and Cons of Technologies in Computer Science Papers
Hono Shirai | Naoya Inoue | Jun Suzuki | Kentaro Inui
Proceedings of the Workshop on Extracting Structured Knowledge from Scientific Publications

This paper explores a task for extracting a technological expression and its pros/cons from computer science papers. We report ongoing efforts on an annotated corpus of pros/cons and an analysis of the nature of the automatic extraction task. Specifically, we show how to adapt the targeted sentiment analysis task for pros/cons extraction in computer science papers and conduct an annotation study. In order to identify the challenges of the automatic extraction task, we construct a strong baseline model and conduct an error analysis. The experiments show that pros/cons can be consistently annotated by several annotators, and that the task is challenging due to domain-specific knowledge. The annotated dataset is made publicly available for research purposes.

pdf bib
The AIP-Tohoku System at the BEA-2019 Shared Task
Hiroki Asano | Masato Mita | Tomoya Mizumoto | Jun Suzuki
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

We introduce the AIP-Tohoku grammatical error correction (GEC) system for the BEA-2019 shared task in Track 1 (Restricted Track) and Track 2 (Unrestricted Track) using the same system architecture. Our system comprises two key components: error generation and sentence-level error detection. In particular, GEC with sentence-level grammatical error detection is a novel and versatile approach, and we experimentally demonstrate that it significantly improves the precision of the base model. Our system is ranked 9th in Track 1 and 2nd in Track 2.

pdf bib
Diamonds in the Rough: Generating Fluent Sentences from Early-Stage Drafts for Academic Writing Assistance
Takumi Ito | Tatsuki Kuribayashi | Hayato Kobayashi | Ana Brassard | Masato Hagiwara | Jun Suzuki | Kentaro Inui
Proceedings of the 12th International Conference on Natural Language Generation

The writing process consists of several stages such as drafting, revising, editing, and proofreading. Studies on writing assistance, such as grammatical error correction (GEC), have mainly focused on sentence editing and proofreading, where surface-level issues such as typographical errors, spelling errors, or grammatical errors should be corrected. We broaden this focus to include the earlier revising stage, where sentences require adjustment to the information included or major rewriting and propose Sentence-level Revision (SentRev) as a new writing assistance task. Well-performing systems in this task can help inexperienced authors by producing fluent, complete sentences given their rough, incomplete drafts. We build a new freely available crowdsourced evaluation dataset consisting of incomplete sentences authored by non-native writers paired with their final versions extracted from published academic papers for developing and evaluating SentRev models. We also establish baseline performance on SentRev using our newly built evaluation dataset.

pdf bib
Subword-based Compact Reconstruction of Word Embeddings
Shota Sasaki | Jun Suzuki | Kentaro Inui
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

The idea of subword-based word embeddings has been proposed in the literature, mainly for solving the out-of-vocabulary (OOV) word problem observed in standard word-based word embeddings. In this paper, we propose a method of reconstructing pre-trained word embeddings using subword information that can effectively represent a large number of subword embeddings in a considerably small fixed space. The key techniques of our method are twofold: memory-shared embeddings and a variant of the key-value-query self-attention mechanism. Our experiments show that our reconstructed subword-based embeddings can successfully imitate well-trained word embeddings in a small fixed space while preventing quality degradation across several linguistic benchmark datasets, and can simultaneously predict effective embeddings of OOV words. We also demonstrate the effectiveness of our reconstruction method when we apply them to downstream tasks.

pdf bib
The Sally Smedley Hyperpartisan News Detector at SemEval-2019 Task 4
Kazuaki Hanawa | Shota Sasaki | Hiroki Ouchi | Jun Suzuki | Kentaro Inui
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes our system submitted to the formal run of SemEval-2019 Task 4: Hyperpartisan news detection. Our system is based on a linear classifier using several features, i.e., 1) embedding features based on the pre-trained BERT embeddings, 2) article length features, and 3) embedding features of informative phrases extracted from by-publisher dataset. Our system achieved 80.9% accuracy on the test set for the formal run and got the 3rd place out of 42 teams.

2018

pdf bib
Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions
Sho Yokoi | Sosuke Kobayashi | Kenji Fukumizu | Jun Suzuki | Kentaro Inui
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In this paper, we propose a new kernel-based co-occurrence measure that can be applied to sparse linguistic expressions (e.g., sentences) with a very short learning time, as an alternative to pointwise mutual information (PMI). As well as deriving PMI from mutual information, we derive this new measure from the Hilbert–Schmidt independence criterion (HSIC); thus, we call the new measure the pointwise HSIC (PHSIC). PHSIC can be interpreted as a smoothed variant of PMI that allows various similarity metrics (e.g., sentence embeddings) to be plugged in as kernels. Moreover, PHSIC can be estimated by simple and fast (linear in the size of the data) matrix calculations regardless of whether we use linear or nonlinear kernels. Empirically, in a dialogue response selection task, PHSIC is learned thousands of times faster than an RNN-based PMI while outperforming PMI in accuracy. In addition, we also demonstrate that PHSIC is beneficial as a criterion of a data selection task for machine translation owing to its ability to give high (low) scores to a consistent (inconsistent) pair with other pairs.

pdf bib
Direct Output Connection for a High-Rank Language Model
Sho Takase | Jun Suzuki | Masaaki Nagata
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

This paper proposes a state-of-the-art recurrent neural network (RNN) language model that combines probability distributions computed not only from a final RNN layer but also middle layers. This method raises the expressive power of a language model based on the matrix factorization interpretation of language modeling introduced by Yang et al. (2018). Our proposed method improves the current state-of-the-art language model and achieves the best score on the Penn Treebank and WikiText-2, which are the standard benchmark datasets. Moreover, we indicate our proposed method contributes to application tasks: machine translation and headline generation.

pdf bib
Improving Neural Machine Translation by Incorporating Hierarchical Subword Features
Makoto Morishita | Jun Suzuki | Masaaki Nagata
Proceedings of the 27th International Conference on Computational Linguistics

This paper focuses on subword-based Neural Machine Translation (NMT). We hypothesize that in the NMT model, the appropriate subword units for the following three modules (layers) can differ: (1) the encoder embedding layer, (2) the decoder embedding layer, and (3) the decoder output layer. We find the subword based on Sennrich et al. (2016) has a feature that a large vocabulary is a superset of a small vocabulary and modify the NMT model enables the incorporation of several different subword units in a single embedding layer. We refer these small subword features as hierarchical subword features. To empirically investigate our assumption, we compare the performance of several different subword units and hierarchical subword features for both the encoder and decoder embedding layers. We confirmed that incorporating hierarchical subword features in the encoder consistently improves BLEU scores on the IWSLT evaluation datasets.

pdf bib
An Empirical Study of Building a Strong Baseline for Constituency Parsing
Jun Suzuki | Sho Takase | Hidetaka Kamigaito | Makoto Morishita | Masaaki Nagata
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

This paper investigates the construction of a strong baseline based on general purpose sequence-to-sequence models for constituency parsing. We incorporate several techniques that were mainly developed in natural language generation tasks, e.g., machine translation and summarization, and demonstrate that the sequence-to-sequence model achieves the current top-notch parsers’ performance (almost) without requiring any explicit task-specific knowledge or architecture of constituent parsing.

pdf bib
Unsupervised Token-wise Alignment to Improve Interpretation of Encoder-Decoder Models
Shun Kiyono | Sho Takase | Jun Suzuki | Naoaki Okazaki | Kentaro Inui | Masaaki Nagata
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Developing a method for understanding the inner workings of black-box neural methods is an important research endeavor. Conventionally, many studies have used an attention matrix to interpret how Encoder-Decoder-based models translate a given source sentence to the corresponding target sentence. However, recent studies have empirically revealed that an attention matrix is not optimal for token-wise translation analyses. We propose a method that explicitly models the token-wise alignment between the source and target sequences to provide a better analysis. Experiments show that our method can acquire token-wise alignments that are superior to those of an attention mechanism.

pdf bib
NTT’s Neural Machine Translation Systems for WMT 2018
Makoto Morishita | Jun Suzuki | Masaaki Nagata
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper describes NTT’s neural machine translation systems submitted to the WMT 2018 English-German and German-English news translation tasks. Our submission has three main components: the Transformer model, corpus cleaning, and right-to-left n-best re-ranking techniques. Through our experiments, we identified two keys for improving accuracy: filtering noisy training sentences and right-to-left re-ranking. We also found that the Transformer model requires more training data than the RNN-based model, and the RNN-based model sometimes achieves better accuracy than the Transformer model when the corpus is small.

pdf bib
Reducing Odd Generation from Neural Headline Generation
Shun Kiyono | Sho Takase | Jun Suzuki | Naoaki Okazaki | Kentaro Inui | Masaaki Nagata
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

2017

pdf bib
Enumeration of Extractive Oracle Summaries
Tsutomu Hirao | Masaaki Nishino | Jun Suzuki | Masaaki Nagata
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

To analyze the limitations and the future directions of the extractive summarization paradigm, this paper proposes an Integer Linear Programming (ILP) formulation to obtain extractive oracle summaries in terms of ROUGE-N. We also propose an algorithm that enumerates all of the oracle summaries for a set of reference summaries to exploit F-measures that evaluate which system summaries contain how many sentences that are extracted as an oracle summary. Our experimental results obtained from Document Understanding Conference (DUC) corpora demonstrated the following: (1) room still exists to improve the performance of extractive summarization; (2) the F-measures derived from the enumerated oracle summaries have significantly stronger correlations with human judgment than those derived from single oracle summaries.

pdf bib
Cutting-off Redundant Repeating Generations for Neural Abstractive Summarization
Jun Suzuki | Masaaki Nagata
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

This paper tackles the reduction of redundant repeating generation that is often observed in RNN-based encoder-decoder models. Our basic idea is to jointly estimate the upper-bound frequency of each target vocabulary in the encoder and control the output words based on the estimation in the decoder. Our method shows significant improvement over a strong RNN-based encoder-decoder baseline and achieved its best results on an abstractive summarization benchmark.

pdf bib
Input-to-Output Gate to Improve RNN Language Models
Sho Takase | Jun Suzuki | Masaaki Nagata
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

This paper proposes a reinforcing method that refines the output layers of existing Recurrent Neural Network (RNN) language models. We refer to our proposed method as Input-to-Output Gate (IOG). IOG has an extremely simple structure, and thus, can be easily combined with any RNN language models. Our experiments on the Penn Treebank and WikiText-2 datasets demonstrate that IOG consistently boosts the performance of several different types of current topline RNN language models.

pdf bib
Improving Neural Text Normalization with Data Augmentation at Character- and Morphological Levels
Itsumi Saito | Jun Suzuki | Kyosuke Nishida | Kugatsu Sadamitsu | Satoshi Kobashikawa | Ryo Masumura | Yuji Matsumoto | Junji Tomita
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

In this study, we investigated the effectiveness of augmented data for encoder-decoder-based neural normalization models. Attention based encoder-decoder models are greatly effective in generating many natural languages. % such as machine translation or machine summarization. In general, we have to prepare for a large amount of training data to train an encoder-decoder model. Unlike machine translation, there are few training data for text-normalization tasks. In this paper, we propose two methods for generating augmented data. The experimental results with Japanese dialect normalization indicate that our methods are effective for an encoder-decoder model and achieve higher BLEU score than that of baselines. We also investigated the oracle performance and revealed that there is sufficient room for improving an encoder-decoder model.

pdf bib
NTT Neural Machine Translation Systems at WAT 2017
Makoto Morishita | Jun Suzuki | Masaaki Nagata
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

In this year, we participated in four translation subtasks at WAT 2017. Our model structure is quite simple but we used it with well-tuned hyper-parameters, leading to a significant improvement compared to the previous state-of-the-art system. We also tried to make use of the unreliable part of the provided parallel corpus by back-translating and making a synthetic corpus. Our submitted system achieved the new state-of-the-art performance in terms of the BLEU score, as well as human evaluation.

2016

pdf bib
Neural Headline Generation on Abstract Meaning Representation
Sho Takase | Jun Suzuki | Naoaki Okazaki | Tsutomu Hirao | Masaaki Nagata
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Right-truncatable Neural Word Embeddings
Jun Suzuki | Masaaki Nagata
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Phrase Table Pruning via Submodular Function Maximization
Masaaki Nishino | Jun Suzuki | Masaaki Nagata
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

pdf bib
A Unified Learning Framework of Skip-Grams and Global Vectors
Jun Suzuki | Masaaki Nagata
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf bib
Dependency-based Discourse Parser for Single-Document Summarization
Yasuhisa Yoshida | Jun Suzuki | Tsutomu Hirao | Masaaki Nagata
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf bib
Shift-Reduce Word Reordering for Machine Translation
Katsuhiko Hayashi | Katsuhito Sudoh | Hajime Tsukada | Jun Suzuki | Masaaki Nagata
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Supervised Model Learning with Feature Grouping based on a Discrete Constraint
Jun Suzuki | Masaaki Nagata
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

pdf bib
Distributed Minimum Error Rate Training of SMT using Particle Swarm Optimization
Jun Suzuki | Kevin Duh | Masaaki Nagata
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
Learning Condensed Feature Representations from Large Unsupervised Data Sets for Supervised Learning
Jun Suzuki | Hideki Isozaki | Masaaki Nagata
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2009

pdf bib
An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing
Jun Suzuki | Hideki Isozaki | Xavier Carreras | Michael Collins
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Syntax-Free Approach to Japanese Sentence Compression
Tsutomu Hirao | Jun Suzuki | Hideki Isozaki
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf bib
Semi-Supervised Sequential Labeling and Segmentation Using Giga-Word Scale Unlabeled Data
Jun Suzuki | Hideki Isozaki
Proceedings of ACL-08: HLT

pdf bib
Multi-label Text Categorization with Model Combination based on F1-score Maximization
Akinori Fujino | Hideki Isozaki | Jun Suzuki
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

2007

pdf bib
Online Large-Margin Training for Statistical Machine Translation
Taro Watanabe | Jun Suzuki | Hajime Tsukada | Hideki Isozaki
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

pdf bib
Semi-Supervised Structured Output Learning Based on a Hybrid Generative and Discriminative Approach
Jun Suzuki | Akinori Fujino | Hideki Isozaki
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf bib
Training Conditional Random Fields with Multivariate Evaluation Measures
Jun Suzuki | Erik McDermott | Hideki Isozaki
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

pdf bib
Boosting-based Parse Reranking with Subtree Features
Taku Kudo | Jun Suzuki | Hideki Isozaki
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

pdf bib
Convolution Kernels with Feature Selection for Natural Language Processing Tasks
Jun Suzuki | Hideki Isozaki | Eisaku Maeda
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf bib
Dependency-based Sentence Alignment for Multiple Document Summarization
Tsutomu Hirao | Jun Suzuki | Hideki Isozaki | Eisaku Maeda
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

pdf bib
Question Classification using HDAG Kernel
Jun Suzuki | Hirotoshi Taira | Yutaka Sasaki | Eisaku Maeda
Proceedings of the ACL 2003 Workshop on Multilingual Summarization and Question Answering

pdf bib
Hierarchical Directed Acyclic Graph Kernel: Methods for Structured Natural Language Data
Jun Suzuki | Tsutomu Hirao | Yutaka Sasaki | Eisaku Maeda
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

2002

pdf bib
SVM Answer Selection for Open-Domain Question Answering
Jun Suzuki | Yutaka Sasaki | Eisaku Maeda
COLING 2002: The 19th International Conference on Computational Linguistics