Enhong Chen


2020

pdf bib
Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation
Junliang Guo | Linli Xu | Enhong Chen
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The masked language model has received remarkable attention due to its effectiveness on various natural language processing tasks. However, few works have adopted this technique in the sequence-to-sequence models. In this work, we introduce a jointly masked sequence-to-sequence model and explore its application on non-autoregressive neural machine translation~(NAT). Specifically, we first empirically study the functionalities of the encoder and the decoder in NAT models, and find that the encoder takes a more important role than the decoder regarding the translation quality. Therefore, we propose to train the encoder more rigorously by masking the encoder input while training. As for the decoder, we propose to train it based on the consecutive masking of the decoder input with an n-gram loss function to alleviate the problem of translating duplicate words. The two types of masks are applied to the model jointly at the training stage. We conduct experiments on five benchmark machine translation tasks, and our model can achieve 27.69/32.24 BLEU scores on WMT14 English-German/German-English tasks with 5+ times speed up compared with an autoregressive model.

2019

pdf bib
Budgeted Policy Learning for Task-Oriented Dialogue Systems
Zhirui Zhang | Xiujun Li | Jianfeng Gao | Enhong Chen
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

This paper presents a new approach that extends Deep Dyna-Q (DDQ) by incorporating a Budget-Conscious Scheduling (BCS) to best utilize a fixed, small amount of user interactions (budget) for learning task-oriented dialogue agents. BCS consists of (1) a Poisson-based global scheduler to allocate budget over different stages of training; (2) a controller to decide at each training step whether the agent is trained using real or simulated experiences; (3) a user goal sampling module to generate the experiences that are most effective for policy learning. Experiments on a movie-ticket booking task with simulated and real users show that our approach leads to significant improvements in success rate over the state-of-the-art baselines given the fixed budget.

2018

pdf bib
Bidirectional Generative Adversarial Networks for Neural Machine Translation
Zhirui Zhang | Shujie Liu | Mu Li | Ming Zhou | Enhong Chen
Proceedings of the 22nd Conference on Computational Natural Language Learning

Generative Adversarial Network (GAN) has been proposed to tackle the exposure bias problem of Neural Machine Translation (NMT). However, the discriminator typically results in the instability of the GAN training due to the inadequate training problem: the search space is so huge that sampled translations are not sufficient for discriminator training. To address this issue and stabilize the GAN training, in this paper, we propose a novel Bidirectional Generative Adversarial Network for Neural Machine Translation (BGAN-NMT), which aims to introduce a generator model to act as the discriminator, whereby the discriminator naturally considers the entire translation space so that the inadequate training problem can be alleviated. To satisfy this property, generator and discriminator are both designed to model the joint probability of sentence pairs, with the difference that, the generator decomposes the joint probability with a source language model and a source-to-target translation model, while the discriminator is formulated as a target language model and a target-to-source translation model. To further leverage the symmetry of them, an auxiliary GAN is introduced and adopts generator and discriminator models of original one as its own discriminator and generator respectively. Two GANs are alternately trained to update the parameters. Experiment results on German-English and Chinese-English translation tasks demonstrate that our method not only stabilizes GAN training but also achieves significant improvements over baseline systems.

2017

pdf bib
Stack-based Multi-layer Attention for Transition-based Dependency Parsing
Zhirui Zhang | Shujie Liu | Mu Li | Ming Zhou | Enhong Chen
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Although sequence-to-sequence (seq2seq) network has achieved significant success in many NLP tasks such as machine translation and text summarization, simply applying this approach to transition-based dependency parsing cannot yield a comparable performance gain as in other state-of-the-art methods, such as stack-LSTM and head selection. In this paper, we propose a stack-based multi-layer attention model for seq2seq learning to better leverage structural linguistics information. In our method, two binary vectors are used to track the decoding stack in transition-based parsing, and multi-layer attention is introduced to capture multiple word dependencies in partial trees. We conduct experiments on PTB and CTB datasets, and the results show that our proposed model achieves state-of-the-art accuracy and significant improvement in labeled precision with respect to the baseline seq2seq model.

2016

pdf bib
Chinese Poetry Generation with Planning based Neural Network
Zhe Wang | Wei He | Hua Wu | Haiyang Wu | Wei Li | Haifeng Wang | Enhong Chen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Chinese poetry generation is a very challenging task in natural language processing. In this paper, we propose a novel two-stage poetry generating method which first plans the sub-topics of the poem according to the user’s writing intent, and then generates each line of the poem sequentially, using a modified recurrent neural network encoder-decoder framework. The proposed planning-based method can ensure that the generated poem is coherent and semantically consistent with the user’s intent. A comprehensive evaluation with human judgments demonstrates that our proposed approach outperforms the state-of-the-art poetry generating methods and the poem quality is somehow comparable to human poets.

2014

pdf bib
A Probabilistic Model for Learning Multi-Prototype Word Embeddings
Fei Tian | Hanjun Dai | Jiang Bian | Bin Gao | Rui Zhang | Enhong Chen | Tie-Yan Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

pdf bib
A Dataset for Research on Short-Text Conversations
Hao Wang | Zhengdong Lu | Hang Li | Enhong Chen
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2010

pdf bib
Hedge Classification with Syntactic Dependency Features Based on an Ensemble Classifier
Yi Zheng | Qifeng Dai | Qiming Luo | Enhong Chen
Proceedings of the Fourteenth Conference on Computational Natural Language Learning – Shared Task

2009

pdf bib
An Iterative Approach for Joint Dependency Parsing and Semantic Role Labeling
Qifeng Dai | Enhong Chen | Liu Shi
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task

2008

pdf bib
Probabilistic Model for Syntactic and Semantic Dependency Parsing
Enhong Chen | Liu Shi | Dawei Hu
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning