Jianfeng Gao


2020

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Haoming Jiang | Pengcheng He | Weizhu Chen | Xiaodong Liu | Jianfeng Gao | Tuo Zhao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Transfer learning has fundamentally changed the landscape of natural language processing (NLP). Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, due to the limited data available for downstream tasks and the extremely high complexity of pre-trained models, aggressive fine-tuning often causes the fine-tuned model to overfit the training data of downstream tasks and fail to generalize to unseen data. To address this issue in a principled manner, we propose a new learning framework for robust and efficient fine-tuning of pre-trained models that attains better generalization performance. The proposed framework contains two important ingredients: 1. smoothness-inducing regularization, which effectively manages the complexity of the model; 2. Bregman proximal point optimization, an instance of trust-region methods that prevents aggressive updating. Our experiments show that the proposed framework achieves new state-of-the-art performance on a number of NLP tasks including GLUE, SNLI, SciTail and ANLI. Moreover, it outperforms the state-of-the-art T5 model, the largest pre-trained model with 11 billion parameters, on GLUE.
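The smoothness-inducing regularizer penalizes how much the model’s predictions move under small perturbations of the input embeddings. Below is a minimal PyTorch sketch of this idea, not the authors’ implementation; it assumes a `model` that maps input embeddings directly to classification logits, and all names are illustrative:

```python
import torch
import torch.nn.functional as F

def symmetric_kl(p_logits, q_logits):
    # Symmetrized KL divergence between two categorical predictions.
    p, q = F.softmax(p_logits, dim=-1), F.softmax(q_logits, dim=-1)
    return (F.kl_div(F.log_softmax(q_logits, dim=-1), p, reduction="batchmean")
            + F.kl_div(F.log_softmax(p_logits, dim=-1), q, reduction="batchmean"))

def smoothness_penalty(model, embeds, clean_logits, eps=1e-5, step_size=1e-3, steps=1):
    clean_logits = clean_logits.detach()
    delta = torch.randn_like(embeds) * eps  # small random perturbation
    for _ in range(steps):
        delta.requires_grad_()
        loss = symmetric_kl(clean_logits, model(embeds + delta))
        grad, = torch.autograd.grad(loss, delta)
        # Move delta toward the direction that most violates smoothness.
        delta = (delta + step_size * grad /
                 (grad.norm(dim=-1, keepdim=True) + 1e-12)).detach()
    return symmetric_kl(clean_logits, model(embeds + delta))
```

The returned penalty is added to the task loss with a weighting coefficient; the Bregman proximal point step additionally keeps each parameter update close to the previous iterate.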

MIND: A Large-scale Dataset for News Recommendation
Fangzhao Wu | Ying Qiao | Jiun-Hung Chen | Chuhan Wu | Tao Qi | Jianxun Lian | Danyang Liu | Xing Xie | Jianfeng Gao | Winnie Wu | Ming Zhou
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

News recommendation is an important technique for personalized news services. Compared with product and movie recommendation, which have been studied comprehensively, research on news recommendation is much more limited, mainly due to the lack of a high-quality benchmark dataset. In this paper, we present MIND, a large-scale dataset for news recommendation. Constructed from the user click logs of Microsoft News, MIND contains 1 million users and more than 160k English news articles, each of which has rich textual content such as a title, abstract and body. We demonstrate that MIND is a good testbed for news recommendation through a comparative study of several state-of-the-art news recommendation methods that were originally developed on different proprietary datasets. Our results show that the performance of news recommendation relies heavily on the quality of news content understanding and user interest modeling. Many natural language processing techniques, such as effective text representation methods and pre-trained language models, can effectively improve the performance of news recommendation. The MIND dataset will be available at https://msnews.github.io.

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu | Yu Wang | Jianshu Ji | Hao Cheng | Xueyun Zhu | Emmanuel Awa | Pengcheng He | Weizhu Chen | Hoifung Poon | Guihong Cao | Jianfeng Gao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without a significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
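As an illustration of the multi-task knowledge distillation the toolkit supports, a generic soft-target distillation objective looks like the following (a sketch under standard distillation assumptions, not the toolkit’s actual code; `teacher_probs` are assumed to be temperature-scaled soft targets, e.g. averaged over a task-specific ensemble):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_probs, labels, T=1.0, alpha=0.5):
    # Soft term: match the teacher's output distribution at temperature T.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    teacher_probs, reduction="batchmean") * (T * T)
    # Hard term: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```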

ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems
Qi Zhu | Zheng Zhang | Yan Fang | Xiang Li | Ryuichi Takanobu | Jinchao Li | Baolin Peng | Jianfeng Gao | Xiaoyan Zhu | Minlie Huang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present ConvLab-2, an open-source toolkit that enables researchers to build task-oriented dialogue systems with state-of-the-art models, perform end-to-end evaluation, and diagnose the weaknesses of systems. As the successor of ConvLab, ConvLab-2 inherits ConvLab’s framework but integrates more powerful dialogue models and supports more datasets. In addition, we have developed an analysis tool and an interactive tool to assist researchers in diagnosing dialogue systems. The analysis tool presents rich statistics and summarizes common mistakes from simulated dialogues, which facilitates error analysis and system improvement. The interactive tool provides a user interface that allows developers to diagnose an assembled dialogue system by interacting with the system and modifying the output of each system component.

DIALOGPT: Large-Scale Generative Pre-training for Conversational Response Generation
Yizhe Zhang | Siqi Sun | Michel Galley | Yen-Chun Chen | Chris Brockett | Xiang Gao | Jianfeng Gao | Jingjing Liu | Bill Dolan
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present DialoGPT (dialogue generative pre-trained transformer), a large, tunable neural model for conversational response generation. Trained on 147M conversation-like exchanges extracted from Reddit comment chains spanning 2005 through 2017, DialoGPT extends the Hugging Face PyTorch transformer to attain performance close to human, in terms of both automatic and human evaluation, in single-turn dialogue settings. We show that conversational systems that leverage DialoGPT generate more relevant, contentful and context-consistent responses than strong baseline systems. The pre-trained model and training pipeline are publicly released to facilitate research into neural response generation and the development of more intelligent open-domain dialogue systems.

Conversation Learner - A Machine Teaching Tool for Building Dialog Managers for Task-Oriented Dialog Systems
Swadheen Shukla | Lars Liden | Shahin Shayandeh | Eslam Kamal | Jinchao Li | Matt Mazzola | Thomas Park | Baolin Peng | Jianfeng Gao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Traditionally, industry solutions for building a task-oriented dialog system have relied on helping dialog authors define rule-based dialog managers, represented as dialog flows. While dialog flows are intuitively interpretable and adequate for simple scenarios, they lack the flexibility needed to handle complex dialogs. Purely machine-learned models, on the other hand, can handle complex dialogs, but they are considered black boxes and require large amounts of training data. In this demonstration, we showcase Conversation Learner, a machine teaching tool for building dialog managers. It combines the best of both approaches by enabling dialog authors to create a dialog flow using familiar tools, converting the dialog flow into a parametric model (e.g., neural networks), and allowing dialog authors to improve the dialog manager (i.e., the parametric model) over time by leveraging user-system dialog logs as training data through a machine teaching interface.

Few-shot Natural Language Generation for Task-Oriented Dialog
Baolin Peng | Chenguang Zhu | Chunyuan Li | Xiujun Li | Jinchao Li | Michael Zeng | Jianfeng Gao
Findings of the Association for Computational Linguistics: EMNLP 2020

As a crucial component of task-oriented dialog systems, the Natural Language Generation (NLG) module converts a dialog act represented in semantic form into a natural language response. The success of traditional template-based or statistical models typically relies on heavily annotated data, which is infeasible for new domains. It is therefore pivotal for an NLG system to generalize well with limited labelled data in real applications. To this end, we present FewshotWOZ, the first NLG benchmark to simulate the few-shot learning setting in task-oriented dialog systems. Further, we develop the SC-GPT model, which is pre-trained on a large annotated NLG corpus to acquire controllable generation ability, and fine-tuned with only a few domain-specific labels to adapt to new domains. Experiments on FewshotWOZ and the large Multi-Domain-WOZ datasets show that the proposed SC-GPT significantly outperforms existing methods, as measured by various automatic metrics and human evaluations.
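The core engineering step in this setup is serializing a structured dialog act into a flat token sequence that a GPT-style model can condition on. A hypothetical serialization is sketched below; the paper’s exact format may differ:

```python
def linearize_dialog_act(act_type, slots):
    # ('inform', {'name': 'Hilton', 'area': 'center'}) ->
    # 'inform ( name = Hilton ; area = center )'
    pairs = " ; ".join(f"{k} = {v}" for k, v in slots.items())
    return f"{act_type} ( {pairs} )"

# A fine-tuning pair would then look like:
#   input : "inform ( name = Hilton ; area = center )"
#   target: "The Hilton is located in the city center."
```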

RMM: A Recursive Mental Model for Dialogue Navigation
Homero Roman Roman | Yonatan Bisk | Jesse Thomason | Asli Celikyilmaz | Jianfeng Gao
Findings of the Association for Computational Linguistics: EMNLP 2020

Language-guided robots must be able to both ask humans questions and understand answers. Much existing work focuses only on the latter. In this paper, we go beyond instruction following and introduce a two-agent task in which one agent navigates and asks questions that a second, guiding agent answers. Inspired by theory of mind, we propose the Recursive Mental Model (RMM). The navigating agent models the guiding agent to simulate answers given candidate generated questions. The guiding agent in turn models the navigating agent, simulating the navigation steps it would take, in order to generate answers. We use the progress agents make toward the goal as a reinforcement learning reward signal to directly inform not only navigation actions, but also both question and answer generation. We demonstrate that RMM enables better generalization to novel environments. Interlocutor modelling may be a way forward for human-agent settings in which robots need to both ask and answer questions.

Guided Dialogue Policy Learning without Adversarial Learning in the Loop
Ziming Li | Sungjin Lee | Baolin Peng | Jinchao Li | Julia Kiseleva | Maarten de Rijke | Shahin Shayandeh | Jianfeng Gao
Findings of the Association for Computational Linguistics: EMNLP 2020

Reinforcement learning methods have emerged as a popular choice for training efficient and effective dialogue policies. However, these methods suffer from sparse and unstable reward signals, returned by a user simulator only when a dialogue finishes. Moreover, the reward signal is manually designed by human experts, which requires domain knowledge. Recently, a number of adversarial learning methods have been proposed to learn the reward function together with the dialogue policy. However, updating the dialogue policy and the reward model alternately on the fly restricts us to policy-gradient-based algorithms, such as REINFORCE and PPO. Moreover, the alternating training of a dialogue agent and the reward model can easily get stuck in local optima or result in mode collapse. To overcome these issues, we propose to decompose the adversarial training into two steps. First, we train the discriminator with an auxiliary dialogue generator; then we incorporate the derived reward model into a common reinforcement learning method to guide dialogue policy learning. This approach is applicable to both on-policy and off-policy reinforcement learning methods. Our extensive experiments show that the proposed method (1) achieves a remarkable task success rate using both on-policy and off-policy reinforcement learning methods, and (2) has the potential to transfer knowledge from existing domains to a new domain.
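The decomposition amounts to pre-training a discriminator offline, freezing it, and reading a reward off its output for any RL algorithm. A common construction for such a discriminator-derived reward, shown here as a hedged sketch rather than the paper’s exact shaping:

```python
import math

def reward_from_discriminator(d_prob, clip=1e-8):
    # d_prob: the frozen discriminator's probability that a dialogue is
    # human-like. The log-odds are positive when the discriminator is
    # "fooled" and negative otherwise.
    p = min(max(d_prob, clip), 1.0 - clip)
    return math.log(p) - math.log(1.0 - p)
```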

The Design and Implementation of XiaoIce, an Empathetic Social Chatbot
Li Zhou | Jianfeng Gao | Di Li | Heung-Yeung Shum
Computational Linguistics, Volume 46, Issue 1 - March 2020

This article describes the development of Microsoft XiaoIce, the most popular social chatbot in the world. XiaoIce is uniquely designed as an artificial intelligence companion with an emotional connection to satisfy the human need for communication, affection, and social belonging. We take into account both intelligent quotient and emotional quotient in system design, cast human–machine social chat as decision-making over Markov Decision Processes, and optimize XiaoIce for long-term user engagement, measured in expected Conversation-turns Per Session (CPS). We detail the system architecture and key components, including the dialogue manager, core chat, skills, and an empathetic computing module. We show how XiaoIce dynamically recognizes human feelings and states, understands user intent, and responds to user needs throughout long conversations. Since its release in 2014, XiaoIce has communicated with over 660 million active users and succeeded in establishing long-term relationships with many of them. Analysis of large-scale online logs shows that XiaoIce has achieved an average CPS of 23, which is significantly higher than that of other chatbots and even human conversations.

PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking
Hannah Rashkin | Asli Celikyilmaz | Yejin Choi | Jianfeng Gao
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose the task of outline-conditioned story generation: given an outline as a set of phrases that describe key characters and events to appear in a story, the task is to generate a coherent narrative that is consistent with the provided outline. This task is challenging as the input only provides a rough sketch of the plot, and thus, models need to generate a story by interweaving the key points provided in the outline. This requires the model to keep track of the dynamic states of the latent plot, conditioning on the input outline while generating the full story. We present PlotMachines, a neural narrative model that learns to transform an outline into a coherent story by tracking the dynamic plot states. In addition, we enrich PlotMachines with high-level discourse structure so that the model can learn different writing styles corresponding to different parts of the narrative. Comprehensive experiments over three fiction and non-fiction datasets demonstrate that large-scale language models, such as GPT-2 and Grover, despite their impressive generation performance, are not sufficient to generate coherent narratives for the given outline, and that dynamic plot state tracking is important for composing narratives with tighter, more consistent plots.

Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Chunyuan Li | Xiang Gao | Yuan Li | Baolin Peng | Xiujun Li | Yizhe Zhang | Jianfeng Gao
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

When trained effectively, the Variational Autoencoder (VAE) can be both a powerful generative model and an effective representation learning framework for natural language. In this paper, we propose Optimus (Organizing sentences via Pre-Trained Modeling of a Universal Space), the first large-scale language VAE model. A universal latent embedding space for sentences is first pre-trained on a large text corpus, and then fine-tuned for various language generation and understanding tasks. Compared with GPT-2, Optimus enables guided language generation at an abstract level using the latent vectors. Compared with BERT, Optimus generalizes better on low-resource language understanding tasks thanks to the smooth structure of its latent space. Extensive experimental results on a wide range of language tasks demonstrate the effectiveness of Optimus, which achieves new state-of-the-art results on VAE language modeling benchmarks.

Understanding the Difficulty of Training Transformers
Liyuan Liu | Xiaodong Liu | Jianfeng Gao | Weizhu Chen | Jiawei Han
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Transformers have proved effective in many NLP tasks. However, their training requires non-trivial effort in carefully designing cutting-edge optimizers and learning rate schedulers (e.g., conventional SGD fails to train Transformers effectively). Our objective here is to understand what complicates Transformer training from both empirical and theoretical perspectives. Our analysis reveals that unbalanced gradients are not the root cause of training instability. Instead, we identify an amplification effect that substantially influences training: for each layer in a multi-layer Transformer model, heavy dependency on its residual branch makes training unstable, since it amplifies small parameter perturbations (e.g., parameter updates) into significant disturbances in the model output. Yet we observe that a light dependency limits the model’s potential and leads to inferior trained models. Inspired by our analysis, we propose Admin (Adaptive model initialization) to stabilize training in the early stage and unleash the model’s full potential in the late stage. Extensive experiments show that Admin is more stable, converges faster, and leads to better performance.
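In simplified form, Admin controls how strongly each layer leans on its residual branch via a scale on the shortcut connection. A schematic PyTorch module follows; the released implementation calibrates the scale ω from output variances at initialization, whereas this sketch simply makes it a learnable parameter:

```python
import torch
import torch.nn as nn

class ScaledResidual(nn.Module):
    def __init__(self, sublayer, d_model, omega_init=1.0):
        super().__init__()
        self.sublayer = sublayer
        self.omega = nn.Parameter(torch.full((d_model,), float(omega_init)))
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        # Rebalancing the shortcut against the residual branch limits the
        # amplification effect the paper identifies.
        return self.norm(x * self.omega + self.sublayer(x))
```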

Is Your Goal-Oriented Dialog Model Performing Really Well? Empirical Analysis of System-wise Evaluation
Ryuichi Takanobu | Qi Zhu | Jinchao Li | Baolin Peng | Jianfeng Gao | Minlie Huang
Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue

There is growing interest in developing goal-oriented dialog systems that serve users in accomplishing complex tasks through multi-turn conversations. Although many methods have been devised to evaluate and improve the performance of individual dialog components, there is a lack of comprehensive empirical study on how different components contribute to the overall performance of a dialog system. In this paper, we perform a system-wise evaluation and present an empirical analysis of different types of dialog systems composed of different modules in different settings. Our results show that (1) a pipeline dialog system trained with fine-grained supervision signals at different component levels often obtains better performance than systems that use joint or end-to-end models trained on coarse-grained labels, (2) component-wise, single-turn evaluation results are not always consistent with the overall performance of a dialog system, and (3) despite the discrepancy between simulators and human users, simulated evaluation is still a valid alternative to costly human evaluation, especially in the early stages of development.

2019

REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning
Ming Jiang | Junjie Hu | Qiuyuan Huang | Lei Zhang | Jana Diesner | Jianfeng Gao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Popular metrics used for evaluating image captioning systems, such as BLEU and CIDEr, provide a single score to gauge a system’s overall effectiveness. This score is often not informative enough to indicate what specific errors a given system makes. In this study, we present REO, a fine-grained evaluation method for automatically measuring the performance of image captioning systems. REO assesses the quality of captions from three perspectives: 1) Relevance to the ground truth, 2) Extraness of the content that is irrelevant to the ground truth, and 3) Omission of the elements in the images and human references. Experiments on three benchmark datasets demonstrate that our method achieves higher consistency with human judgments and provides more intuitive evaluation results than alternative metrics.

Robust Navigation with Language Pretraining and Stochastic Sampling
Xiujun Li | Chunyuan Li | Qiaolin Xia | Yonatan Bisk | Asli Celikyilmaz | Jianfeng Gao | Noah A. Smith | Yejin Choi
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes that generalize well to previously unseen instructions and environments. In this paper, we report two simple but highly effective methods that address these challenges and lead to new state-of-the-art performance. First, we adapt large-scale pretrained language models to learn text representations that generalize better to previously unseen instructions. Second, we propose a stochastic sampling scheme to reduce the considerable gap between the expert actions seen in training and the sampled actions taken at test time, so that the agent can learn to correct its own mistakes during long sequential action decoding. Combining the two techniques, we achieve a new state of the art on the Room-to-Room benchmark, with a 6% absolute gain over the previous best result (47% -> 53%) on the Success Rate weighted by Path Length metric.
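The sampling scheme can be reduced to one helper: during training, occasionally feed the agent its own sampled action instead of the expert action (an illustrative scheduled-sampling-style sketch; the paper’s exact scheme may differ):

```python
import random

def next_action(expert_action, sampled_action, p_sample=0.3):
    # Exposing the agent to its own sampled actions during training lets
    # it learn to recover from its mistakes at test time.
    return sampled_action if random.random() < p_sample else expert_action
```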

Structuring Latent Spaces for Stylized Response Generation
Xiang Gao | Yizhe Zhang | Sungjin Lee | Michel Galley | Chris Brockett | Jianfeng Gao | Bill Dolan
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Generating responses in a targeted style is a useful yet challenging task, especially in the absence of parallel data. With limited data, existing methods tend to generate responses that are either less stylized or less context-relevant. We propose StyleFusion, which bridges conversation modeling and non-parallel style transfer by sharing a structured latent space. This structure allows the system to generate stylized relevant responses by sampling in the neighborhood of the conversation model prediction, and continuously control the style level. We demonstrate this method using dialogues from Reddit data and two sets of sentences with distinct styles (arXiv and Sherlock Holmes novels). Automatic and human evaluation show that, without sacrificing appropriateness, the system generates responses of the targeted style and outperforms competitive baselines.

TIGEr: Text-to-Image Grounding for Image Caption Evaluation
Ming Jiang | Qiuyuan Huang | Lei Zhang | Xin Wang | Pengchuan Zhang | Zhe Gan | Jana Diesner | Jianfeng Gao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper presents TIGEr, a new metric for the automatic evaluation of image captioning systems. Popular metrics, such as BLEU and CIDEr, are based solely on text matching between reference captions and machine-generated captions, potentially leading to biased evaluations because references may not fully cover the image content and natural language is inherently ambiguous. Building upon a machine-learned text-image grounding model, TIGEr evaluates caption quality not only by how well a caption represents image content, but also by how well machine-generated captions match human-generated captions. Our empirical tests show that TIGEr has higher consistency with human judgments than existing alternative metrics. We also comprehensively assess the metric’s effectiveness in caption evaluation by measuring the correlation between human judgments and metric scores.

Adversarial Domain Adaptation for Machine Reading Comprehension
Huazheng Wang | Zhe Gan | Xiaodong Liu | Jingjing Liu | Jianfeng Gao | Hongning Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In this paper, we focus on unsupervised domain adaptation for Machine Reading Comprehension (MRC), where the source domain has a large amount of labeled data, while only unlabeled passages are available in the target domain. To this end, we propose an Adversarial Domain Adaptation framework (AdaMRC), where (i) pseudo questions are first generated for unlabeled passages in the target domain, and then (ii) a domain classifier is incorporated into an MRC model to predict which domain a given passage-question pair comes from. The classifier and the passage-question encoder are jointly trained using adversarial learning to enforce domain-invariant representation learning. Comprehensive evaluations demonstrate that our approach (i) is generalizable to different MRC models and datasets, (ii) can be combined with pre-trained large-scale language models (such as ELMo and BERT), and (iii) can be extended to semi-supervised learning.

Implicit Deep Latent Variable Models for Text Generation
Le Fang | Chunyuan Li | Jianfeng Gao | Wen Dong | Changyou Chen
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Deep latent variable models (LVMs) such as the variational auto-encoder (VAE) have recently played an important role in text generation. One key factor is the exploitation of smooth latent structures to guide the generation. However, the representation power of VAEs is limited for two reasons: (1) the Gaussian assumption is often made on the variational posteriors, and (2) a notorious “posterior collapse” issue occurs. In this paper, we advocate sample-based representations of variational distributions for natural language, leading to implicit latent features that provide more flexible representation power than Gaussian-based posteriors. We further develop an LVM that directly matches the aggregated posterior to the prior. It can be viewed as a natural extension of VAEs with a regularization that maximizes mutual information, mitigating the “posterior collapse” issue. We demonstrate the effectiveness and versatility of our models in various text generation scenarios, including language modeling, unaligned style transfer, and dialog response generation. The source code to reproduce our experimental results is available on GitHub.

A Hybrid Neural Network Model for Commonsense Reasoning
Pengcheng He | Xiaodong Liu | Weizhu Chen | Jianfeng Gao
Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing

This paper proposes a hybrid neural network (HNN) model for commonsense reasoning. An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers. HNN obtains new state-of-the-art results on three classic commonsense reasoning tasks, pushing the WNLI benchmark to 89%, the Winograd Schema Challenge (WSC) benchmark to 75.1%, and the PDP60 benchmark to 90.0%. An ablation study shows that language models and semantic similarity models are complementary approaches to commonsense reasoning, and that HNN effectively combines the strengths of both. The code and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.

Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models
Dinghan Shen | Asli Celikyilmaz | Yizhe Zhang | Liqun Chen | Xin Wang | Jianfeng Gao | Lawrence Carin
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation with latent variables. However, previous work typically focuses on synthesizing relatively short sentences (up to 20 words), and the posterior collapse issue has been widely identified in text-VAEs. In this paper, we propose to leverage multi-level structures to learn a VAE model for generating long and coherent text. In particular, a hierarchy of stochastic layers between the encoder and decoder networks is employed to abstract more informative and semantically rich latent codes. In addition, we utilize a multi-level decoder structure to capture the coherent long-term structure inherent in long-form texts, generating intermediate sentence representations as high-level plan vectors. Extensive experimental results demonstrate that the proposed multi-level VAE model produces more coherent and less repetitive long text than baselines, and can also mitigate the posterior-collapse issue.

Budgeted Policy Learning for Task-Oriented Dialogue Systems
Zhirui Zhang | Xiujun Li | Jianfeng Gao | Enhong Chen
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

This paper presents a new approach that extends Deep Dyna-Q (DDQ) by incorporating a Budget-Conscious Scheduling (BCS) to best utilize a fixed, small amount of user interactions (budget) for learning task-oriented dialogue agents. BCS consists of (1) a Poisson-based global scheduler to allocate budget over different stages of training; (2) a controller to decide at each training step whether the agent is trained using real or simulated experiences; (3) a user goal sampling module to generate the experiences that are most effective for policy learning. Experiments on a movie-ticket booking task with simulated and real users show that our approach leads to significant improvements in success rate over the state-of-the-art baselines given the fixed budget.

Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu | Pengcheng He | Weizhu Chen | Jianfeng Gao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks. MT-DNN not only leverages large amounts of cross-task data, but also benefits from a regularization effect that leads to more general representations to help adapt to new tasks and domains. MT-DNN extends the model proposed in Liu et al. (2015) by incorporating a pre-trained bidirectional transformer language model, known as BERT (Devlin et al., 2018). MT-DNN obtains new state-of-the-art results on ten NLU tasks, including SNLI, SciTail, and eight out of nine GLUE tasks, pushing the GLUE benchmark to 82.7% (2.2% absolute improvement) as of February 25, 2019 on the latest GLUE test set. We also demonstrate using the SNLI and SciTail datasets that the representations learned by MT-DNN allow domain adaptation with substantially fewer in-domain labels than the pre-trained BERT representations. Our code and pre-trained models will be made publicly available.
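Schematically, MT-DNN training shuffles mini-batches from all tasks together and routes each batch through the shared encoder plus a task-specific head. A sketch with placeholder interfaces (the actual training loop lives in the released code):

```python
import random
import torch.nn.functional as F

def train_epoch(encoder, heads, task_batches, optimizer):
    # task_batches: task name -> list of batches for that task.
    mixed = [(task, b) for task, batches in task_batches.items() for b in batches]
    random.shuffle(mixed)  # the encoder sees a task mixture all epoch long
    for task, batch in mixed:
        optimizer.zero_grad()
        rep = encoder(batch["input_ids"])     # shared BERT-style encoder
        logits = heads[task](rep)             # task-specific output layer
        loss = F.cross_entropy(logits, batch["labels"])
        loss.backward()
        optimizer.step()
```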

Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading
Lianhui Qin | Michel Galley | Chris Brockett | Xiaodong Liu | Xiang Gao | Bill Dolan | Yejin Choi | Jianfeng Gao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Although neural conversational models are effective in learning how to produce fluent responses, their primary challenge lies in knowing what to say to make the conversation contentful and non-vacuous. We present a new end-to-end approach to contentful neural conversation that jointly models response generation and on-demand machine reading. The key idea is to provide the conversation model with relevant long-form text on the fly as a source of external knowledge. The model performs QA-style reading comprehension on this text in response to each conversational turn, thereby allowing for more focused integration of external knowledge than has been possible in prior approaches. To support further research on knowledge-grounded conversation, we introduce a new large-scale conversation dataset grounded in external web pages (2.8M turns, 7.4M sentences of grounding). Both human evaluation and automated metrics show that our approach results in more contentful responses compared to a variety of previous methods, improving both the informativeness and diversity of generated output.

Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
Zhe Gan | Yu Cheng | Ahmed Kholy | Linjie Li | Jingjing Liu | Jianfeng Gao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

This paper presents a new model for visual dialog, Recurrent Dual Attention Network (ReDAN), using multi-step reasoning to answer a series of questions about an image. In each question-answering turn of a dialog, ReDAN infers the answer progressively through multiple reasoning steps. In each step of the reasoning process, the semantic representation of the question is updated based on the image and the previous dialog history, and the recurrently-refined representation is used for further reasoning in the subsequent step. On the VisDial v1.0 dataset, the proposed ReDAN model achieves a new state-of-the-art of 64.47% NDCG score. Visualization on the reasoning process further demonstrates that ReDAN can locate context-relevant visual and textual clues via iterative refinement, which can lead to the correct answer step-by-step.

ConvLab: Multi-Domain End-to-End Dialog System Platform
Sungjin Lee | Qi Zhu | Ryuichi Takanobu | Zheng Zhang | Yaoqin Zhang | Xiang Li | Jinchao Li | Baolin Peng | Xiujun Li | Minlie Huang | Jianfeng Gao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present ConvLab, an open-source multi-domain end-to-end dialog system platform that enables researchers to quickly set up experiments with reusable components and compare a large set of different approaches, ranging from conventional pipeline systems to end-to-end neural models, in common environments. ConvLab offers a set of fully annotated datasets and associated pre-trained reference models. As a showcase, we extend the MultiWOZ dataset with user dialog act annotations to train all component models and demonstrate how ConvLab makes it easy to conduct complicated experiments in multi-domain end-to-end dialog settings.

Microsoft Icecaps: An Open-Source Toolkit for Conversation Modeling
Vighnesh Leonardo Shiv | Chris Quirk | Anshuman Suri | Xiang Gao | Khuram Shahid | Nithya Govindarajan | Yizhe Zhang | Jianfeng Gao | Michel Galley | Chris Brockett | Tulasi Menon | Bill Dolan
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

The Intelligent Conversation Engine: Code and Pre-trained Systems (Microsoft Icecaps) is an upcoming open-source natural language processing repository. Icecaps wraps TensorFlow functionality in a modular component-based architecture, presenting an intuitive and flexible paradigm for constructing sophisticated learning setups. Capabilities include multitask learning between models with shared parameters, upgraded language model decoding features, a range of built-in architectures, and a user-friendly data processing pipeline. The system is targeted toward conversational tasks, exploring diverse response generation, coherence, and knowledge grounding. Icecaps also provides pre-trained conversational models that can be either used directly or loaded for fine-tuning or bootstrapping other models; these models power an online demo of our framework.

Towards Coherent and Cohesive Long-form Text Generation
Woon Sang Cho | Pengchuan Zhang | Yizhe Zhang | Xiujun Li | Michel Galley | Chris Brockett | Mengdi Wang | Jianfeng Gao
Proceedings of the First Workshop on Narrative Understanding

Generating coherent and cohesive long-form text is a challenging task. Previous works relied on large amounts of human-generated text to train neural language models. However, few attempted to explicitly improve neural language models from the perspectives of coherence and cohesion. In this work, we propose a new neural language model that is equipped with two neural discriminators, which provide feedback signals at the level of the sentence (cohesion) and the paragraph (coherence). Our model is trained using a simple yet efficient variant of policy gradient, called ‘negative-critical sequence training’, which eliminates the need to train a separate critic for estimating the ‘baseline’. Results demonstrate the effectiveness of our approach, showing improvements over a strong baseline, a recurrent attention-based bidirectional MLE-trained neural language model.

DoubleTransfer at MEDIQA 2019: Multi-Source Transfer Learning for Natural Language Understanding in the Medical Domain
Yichong Xu | Xiaodong Liu | Chunyuan Li | Hoifung Poon | Jianfeng Gao
Proceedings of the 18th BioNLP Workshop and Shared Task

This paper describes our system for the MEDIQA-2019 competition. We use a multi-source transfer learning approach to transfer knowledge from MT-DNN and SciBERT to natural language understanding tasks in the medical domain. During transfer-learning fine-tuning, we use multi-task learning on NLI, RQE and QA tasks across general and medical domains to improve performance. The proposed methods prove effective for natural language understanding in the medical domain, and we rank first on the QA task.

Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing
Hao Fu | Chunyuan Li | Xiaodong Liu | Jianfeng Gao | Asli Celikyilmaz | Lawrence Carin
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Variational autoencoders (VAEs) with an auto-regressive decoder have been applied to many natural language processing (NLP) tasks. The VAE objective consists of two terms, the KL regularization term and the reconstruction term, balanced by a weighting hyper-parameter 𝛽. One notorious training difficulty is that the KL term tends to vanish. In this paper we study different scheduling schemes for 𝛽, and show that KL vanishing is caused by the lack of good latent codes for training the decoder at the beginning of optimization. To remedy this, we propose a cyclical annealing schedule, which simply repeats the process of increasing 𝛽 multiple times. This procedure allows us to learn progressively more meaningful latent codes by leveraging the results of previous learning cycles as warm restarts. The effectiveness of the cyclical annealing schedule is validated on a broad range of NLP tasks, including language modeling, dialog response generation and semi-supervised text classification.
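The schedule itself fits in a few lines. A sketch matching the description above: within each cycle, 𝛽 ramps linearly from 0 to 1 over the first portion of the cycle and then holds at 1:

```python
def cyclical_beta(step, total_steps, n_cycles=4, ratio=0.5):
    # Position within the current cycle, in [0, 1).
    cycle_len = total_steps / n_cycles
    pos = (step % cycle_len) / cycle_len
    # Linear ramp over the first `ratio` of the cycle, then hold at 1.
    return min(pos / ratio, 1.0)
```

Each cycle restarts 𝛽 at 0 so the model can refine the latent codes learned in the previous cycle, which is what provides the warm restarts.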

Unsupervised Deep Structured Semantic Models for Commonsense Reasoning
Shuohang Wang | Sheng Zhang | Yelong Shen | Xiaodong Liu | Jingjing Liu | Jianfeng Gao | Jing Jiang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Commonsense reasoning is fundamental to natural language understanding. While traditional methods rely heavily on human-crafted features and knowledge bases, we explore learning commonsense knowledge from a large amount of raw text via unsupervised learning. We propose two neural network models based on the Deep Structured Semantic Models (DSSM) framework to tackle two classic commonsense reasoning tasks, the Winograd Schema Challenge (WSC) and Pronoun Disambiguation (PDP). Evaluation shows that the proposed models effectively capture contextual information in the sentence and co-reference information between pronouns and nouns, and achieve significant improvement over previous state-of-the-art approaches.

Jointly Optimizing Diversity and Relevance in Neural Response Generation
Xiang Gao | Sungjin Lee | Yizhe Zhang | Chris Brockett | Michel Galley | Jianfeng Gao | Bill Dolan
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Although recent neural conversation models have shown great potential, they often generate bland and generic responses. While various approaches have been explored to diversify the output of the conversation model, the improvement often comes at the cost of decreased relevance. In this paper, we propose SpaceFusion, a model that jointly optimizes diversity and relevance by essentially fusing the latent space of a sequence-to-sequence model with that of an autoencoder model via novel regularization terms. As a result, our approach induces a latent space in which the distance and direction from the predicted response vector roughly match relevance and diversity, respectively. This property also lends itself well to an intuitive visualization of the latent space. Both automatic and human evaluation results demonstrate that the proposed approach brings significant improvement over strong baselines in both diversity and relevance.
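The geometry-shaping idea behind the regularization can be sketched as two terms: pull the sequence-to-sequence prediction for a context toward the autoencoder encoding of the observed response, and keep interpolations between the two decodable. An illustrative sketch of the idea, not the published loss:

```python
import torch
import torch.nn.functional as F

def fusion_terms(z_s2s, z_ae):
    # Distance term: the predicted-response vector should lie near the
    # encoding of the actual response.
    d_term = F.mse_loss(z_s2s, z_ae)
    # Interpolation: random convex combinations of the two codes, to be
    # decoded under the usual reconstruction loss so the path between
    # them stays meaningful.
    u = torch.rand(z_s2s.size(0), 1, device=z_s2s.device)
    z_interp = u * z_s2s + (1.0 - u) * z_ae
    return d_term, z_interp
```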

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu | Xiaodong Liu | Yelong Shen | Jingjing Liu | Jianfeng Gao
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains. Inspired by recent ideas of data selection in machine translation, we develop a novel sample re-weighting scheme to assign sample-specific weights to the loss. Empirical study shows that our approach can be applied to many existing MRC models. Combined with contextual representations from pre-trained language models (such as ELMo), we achieve new state-of-the-art results on a set of MRC benchmark datasets. We release our code at https://github.com/xycforgithub/MultiTask-MRC.
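The re-weighting itself reduces to scaling each example’s loss before averaging; how the weights are derived follows the paper’s data-selection scheme and is not reproduced here (a schematic sketch):

```python
def reweighted_loss(per_sample_losses, weights, eps=1e-8):
    # Each example's loss is scaled by a weight reflecting how useful
    # its source sample is for the target MRC task.
    total = sum(w * l for w, l in zip(weights, per_sample_losses))
    return total / max(sum(weights), eps)
```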

2018

Discourse-Aware Neural Rewards for Coherent Text Generation
Antoine Bosselut | Asli Celikyilmaz | Xiaodong He | Jianfeng Gao | Po-Sen Huang | Yejin Choi
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

In this paper, we investigate the use of discourse-aware rewards with reinforcement learning to guide a model to generate long, coherent text. In particular, we propose to learn neural rewards to model cross-sentence ordering as a means to approximate desired discourse structure. Empirical results demonstrate that a generator trained with the learned reward produces more coherent and less repetitive text than models trained with cross-entropy or with reinforcement learning with commonly used scores as rewards.

Subgoal Discovery for Hierarchical Dialogue Policy Learning
Da Tang | Xiujun Li | Jianfeng Gao | Chong Wang | Lihong Li | Tony Jebara
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Developing agents to engage in complex goal-oriented dialogues is challenging partly because the main learning signals are very sparse in long conversations. In this paper, we propose a divide-and-conquer approach that discovers and exploits the hidden structure of the task to enable efficient policy learning. First, given successful example dialogues, we propose the Subgoal Discovery Network (SDN) to divide a complex goal-oriented task into a set of simpler subgoals in an unsupervised fashion. We then use these subgoals to learn a multi-level policy by hierarchical reinforcement learning. We demonstrate our method by building a dialogue agent for the composite task of travel planning. Experiments with simulated and real users show that our approach performs competitively against a state-of-the-art method that requires human-defined subgoals. Moreover, we show that the learned subgoals are often human comprehensible.

Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning
Shang-Yu Su | Xiujun Li | Jianfeng Gao | Jingjing Liu | Yun-Nung Chen
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

This paper presents a Discriminative Deep Dyna-Q (D3Q) approach to improving the effectiveness and robustness of Deep Dyna-Q (DDQ), a recently proposed framework that extends the Dyna-Q algorithm to integrate planning for task-completion dialogue policy learning. To obviate DDQ’s high dependency on the quality of simulated experiences, we incorporate an RNN-based discriminator in D3Q to differentiate simulated experience from real user experience and thereby control the quality of training data. Experiments show that D3Q significantly outperforms DDQ by controlling the quality of the simulated experience used for planning. The effectiveness and robustness of D3Q are further demonstrated in a domain extension setting, where the agent’s capability of adapting to a changing environment is tested.

Stochastic Answer Networks for Machine Reading Comprehension
Xiaodong Liu | Yelong Shen | Kevin Duh | Jianfeng Gao
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension. Compared to previous work such as ReasoNet, which used reinforcement learning to determine the number of steps, SAN’s unique feature is a stochastic prediction dropout applied to the answer module (final layer) of the neural network during training. We show that this simple trick improves robustness and achieves results competitive with the state of the art on the Stanford Question Answering Dataset (SQuAD), Adversarial SQuAD, and the Microsoft MAchine Reading COmprehension dataset (MS MARCO).
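Stochastic prediction dropout averages per-step answer distributions while randomly discarding whole steps during training, so the final answer never depends on any single reasoning step. A hedged PyTorch sketch:

```python
import random
import torch

def san_answer(step_logits, dropout_p=0.4, training=True):
    # step_logits: list of logit tensors, one per reasoning step.
    probs = [torch.softmax(l, dim=-1) for l in step_logits]
    if training:
        kept = [p for p in probs if random.random() > dropout_p]
        probs = kept or probs  # guard against dropping every step
    return torch.stack(probs).mean(dim=0)  # average the remaining steps
```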

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning
Baolin Peng | Xiujun Li | Jianfeng Gao | Jingjing Liu | Kam-Fai Wong
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Training a task-completion dialogue agent via reinforcement learning (RL) is costly because it requires many interactions with real users. One common alternative is to use a user simulator. However, a user simulator usually lacks the language complexity of human interlocutors, and biases in its design may degrade the agent. To address these issues, we present Deep Dyna-Q, which to our knowledge is the first deep RL framework that integrates planning for task-completion dialogue policy learning. We incorporate into the dialogue agent a model of the environment, referred to as the world model, to mimic real user responses and generate simulated experience. During dialogue policy learning, the world model is constantly updated with real user experience to approach real user behavior, and in turn, the dialogue agent is optimized using both real and simulated experience. The effectiveness of our approach is demonstrated on a movie-ticket booking task in both simulated and human-in-the-loop settings.
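The training loop interleaves three processes: direct RL with real users, world-model learning, and planning with simulated experience. A skeleton with placeholder component interfaces (illustrative, not the authors’ code):

```python
def deep_dyna_q(agent, world_model, user, K=5, episodes=1000):
    for _ in range(episodes):
        real = agent.run_episode(user)       # direct reinforcement learning
        agent.update_policy(real)
        world_model.fit(real)                # world-model learning
        for _ in range(K):                   # planning: K simulated episodes
            simulated = agent.run_episode(world_model)
            agent.update_policy(simulated)
```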

Neural Approaches to Conversational AI
Jianfeng Gao | Michel Galley | Lihong Li
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

This tutorial surveys neural approaches to conversational AI that were developed in the last few years. We group conversational systems into three categories: (1) question answering agents, (2) task-oriented dialogue agents, and (3) social bots. For each category, we present a review of state-of-the-art neural approaches, draw the connection between neural approaches and traditional symbolic approaches, and discuss the progress we have made and challenges we are facing, using specific systems and models as case studies.

2017

Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access
Bhuwan Dhingra | Lihong Li | Xiujun Li | Jianfeng Gao | Yun-Nung Chen | Faisal Ahmed | Li Deng
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper proposes KB-InfoBot, a multi-turn dialogue agent that helps users search knowledge bases (KBs) without composing complicated queries. Such goal-oriented dialogue agents typically need to interact with an external database to access real-world knowledge. Previous systems achieved this by issuing a symbolic query to the KB to retrieve entries based on their attributes. However, such symbolic operations break the differentiability of the system and prevent end-to-end training of neural dialogue agents. In this paper, we address this limitation by replacing symbolic queries with an induced “soft” posterior distribution over the KB that indicates which entities the user is interested in. Integrating the soft retrieval process with a reinforcement learner leads to a higher task success rate and reward in both simulations and against real users. We also present a fully neural end-to-end agent, trained entirely from user feedback, and discuss its application toward personalized dialogue agents.
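The “soft” lookup replaces a hard symbolic query with a differentiable posterior over KB entities. An illustrative sketch, assuming the belief tracker’s per-attribute value probabilities have already been mapped onto entities:

```python
import torch

def soft_kb_posterior(attribute_probs):
    # attribute_probs: list of tensors, one per attribute, each of shape
    # (num_entities,), giving the probability of each entity's value for
    # that attribute under the current belief state.
    post = torch.ones_like(attribute_probs[0])
    for p in attribute_probs:
        post = post * p          # combine evidence across attributes
    return post / post.sum()     # normalized, differentiable "soft lookup"
```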

A Nested Attention Neural Hybrid Model for Grammatical Error Correction
Jianshu Ji | Qinlong Wang | Kristina Toutanova | Yongen Gong | Steven Truong | Jianfeng Gao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Grammatical error correction (GEC) systems strive to correct both global errors in word order and usage, and local errors in spelling and inflection. Building upon recent work on neural machine translation, we propose a new hybrid neural model with nested attention layers for GEC. Experiments show that the new model can effectively correct errors of both types by incorporating word- and character-level information, and that the model significantly outperforms previous neural models for GEC as measured on the standard CoNLL-14 benchmark dataset. Further analysis also shows that the superiority of the proposed model can be largely attributed to the nested attention mechanism, which has proven particularly effective in correcting local errors that involve small edits in orthography.

Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation
Nasrin Mostafazadeh | Chris Brockett | Bill Dolan | Michel Galley | Jianfeng Gao | Georgios Spithourakis | Lucy Vanderwende
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The popularity of image sharing on social media and the engagement it creates between users reflect the important role that visual context plays in everyday conversations. We present a novel task, Image Grounded Conversations (IGC), in which natural-sounding conversations are generated about a shared image. To benchmark progress, we introduce a new multiple reference dataset of crowd-sourced, event-centric conversations on images. IGC falls on the continuum between chit-chat and goal-directed conversation models, where visual grounding constrains the topic of conversation to event-driven utterances. Experiments with models trained on social media data show that the combination of visual and textual context enhances the quality of generated conversational turns. In human evaluation, the gap between human performance and that of both neural and retrieval architectures suggests that multi-modal IGC presents an interesting challenge for dialog research.

Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models
Yi Luan | Chris Brockett | Bill Dolan | Jianfeng Gao | Michel Galley
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Building a persona-based conversation agent is challenging owing to the lack of large amounts of speaker-specific conversation data for model training. This paper addresses the problem by proposing a multi-task learning approach to training neural conversation models that leverages both conversation data across speakers and other types of data pertaining to the speaker and speaker roles to be modeled. Experiments show that our approach leads to significant improvements over baseline model quality, generating responses that capture speakers’ traits and speaking styles more precisely. The model offers the benefits of being algorithmically simple and easy to implement, and of not relying on large quantities of data representing specific individual speakers.

End-to-End Task-Completion Neural Dialogue Systems
Xiujun Li | Yun-Nung Chen | Lihong Li | Jianfeng Gao | Asli Celikyilmaz
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

One of the major drawbacks of modularized task-completion dialogue systems is that each module is trained individually, which presents several challenges. For example, downstream modules are affected by earlier modules, and the performance of the entire system is not robust to the accumulated errors. This paper presents a novel end-to-end learning framework for task-completion dialogue systems to tackle such issues. Our neural dialogue system can directly interact with a structured database to assist users in accessing information and accomplishing certain tasks. The reinforcement-learning-based dialogue manager offers robust capabilities to handle noise caused by other components of the dialogue system. Our experiments in a movie-ticket booking domain show that our end-to-end system not only outperforms modularized dialogue system baselines in both objective and subjective evaluation, but is also robust to noise, as demonstrated by several systematic experiments with different error granularities and rates specific to the language understanding module.

An Empirical Analysis of Multiple-Turn Reasoning Strategies in Reading Comprehension Tasks
Yelong Shen | Xiaodong Liu | Kevin Duh | Jianfeng Gao
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Reading comprehension (RC) is a challenging task that requires synthesis of information across sentences and multiple turns of reasoning. Using a state-of-the-art RC model, we empirically investigate the performance of single-turn and multiple-turn reasoning on the SQuAD and MS MARCO datasets. The RC model is an end-to-end neural network with iterative attention, and uses reinforcement learning to dynamically control the number of turns. We find that multiple-turn reasoning outperforms single-turn reasoning for all question and answer types; further, we observe that enabling a flexible number of turns generally improves upon a fixed multiple-turn strategy. We achieve results competitive with the state of the art on these two datasets.

Open-Domain Neural Dialogue Systems
Yun-Nung Chen | Jianfeng Gao
Proceedings of the IJCNLP 2017, Tutorial Abstracts

In the past decade, spoken dialogue systems have become the most prominent component in today’s personal assistants. Many devices have incorporated dialogue system modules, which allow users to speak naturally in order to finish tasks more efficiently. Traditional conversational systems have rather complex and/or modular pipelines. The advance of deep learning technologies has recently given rise to applications of neural models to dialogue modeling. Nevertheless, applying deep learning technologies to building robust and scalable dialogue systems is still a challenging task and an open research area, as it requires a deeper understanding of the classic pipelines as well as detailed knowledge of the benchmarks and models of prior and recent state-of-the-art work. This tutorial is therefore designed to give an overview of dialogue system development while describing the most recent research on building task-oriented and chit-chat dialogue systems, and summarizing the challenges. We target an audience of students and practitioners who have some deep learning background and want to become more familiar with conversational dialogue systems.

pdf bib
Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion
Yelong Shen | Po-Sen Huang | Ming-Wei Chang | Jianfeng Gao
Proceedings of the 2nd Workshop on Representation Learning for NLP

Recent studies on knowledge base completion, the task of recovering missing relationships based on recorded relations, demonstrate the importance of learning embeddings from multi-step relations. However, due to the size of knowledge bases, learning multi-step relations directly on top of observed triplets can be costly. Hence, a manually designed procedure is often used when training such models. In this paper, we propose Implicit ReasoNets (IRNs), which are designed to perform multi-step inference implicitly through a controller and shared memory. Without a human-designed inference procedure, IRNs use training data to learn to perform multi-step inference in a neural embedding space through the shared memory and controller. While the inference procedure does not explicitly operate on top of observed triplets, our proposed model outperforms all previous approaches on the popular FB15k benchmark by more than 5.7%.
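A hedged sketch of one IRN-style inference pass under the description above: the controller state attends over a shared memory matrix, is updated for a few steps, and then scores candidate tail entities. Dimensions, the fixed step count, and the random initialization are stand-ins, not the authors' code.

```python
# Sketch of controller + shared-memory inference for KB completion.
import numpy as np

rng = np.random.default_rng(1)
d, n_mem, n_entities, n_steps = 8, 16, 100, 3
memory = rng.normal(size=(n_mem, d))         # shared memory, learned jointly
entity_emb = rng.normal(size=(n_entities, d))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# controller state initialized from the query (head + relation embeddings)
state = entity_emb[0] + rng.normal(size=d)   # stand-in for e_head + r
for _ in range(n_steps):                     # multi-step implicit inference
    read = softmax(memory @ state) @ memory  # attention read from memory
    state = np.tanh(state + read)            # controller update

scores = entity_emb @ state                  # rank candidate tail entities
print("predicted tail entity:", int(scores.argmax()))
```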

pdf bib
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
Baolin Peng | Xiujun Li | Lihong Li | Jianfeng Gao | Asli Celikyilmaz | Sungjin Lee | Kam-Fai Wong
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Building a dialogue agent to fulfill complex tasks, such as travel planning, is challenging because the agent has to learn to collectively complete multiple subtasks. For example, the agent needs to reserve a hotel and book a flight so that enough time is left for the commute between arrival and hotel check-in. This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales. The dialogue manager consists of: (1) a top-level dialogue policy that selects among subtasks or options, (2) a low-level dialogue policy that selects primitive actions to complete the subtask given by the top-level policy, and (3) a global state tracker that helps ensure all cross-subtask constraints are satisfied. Experiments on a travel planning task with simulated and real users show that our approach leads to significant improvements over three baselines: two based on handcrafted rules and the third based on flat deep reinforcement learning.
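As a rough illustration of the two-level control loop just described, the sketch below pairs a top-level policy that picks a subtask (option) with a low-level policy that issues primitive dialogue acts, while a global tracker accumulates cross-subtask state. The policies are random stubs and every name is hypothetical, not the paper's system.

```python
# Sketch of hierarchical dialogue control with options over subtasks.
import random

SUBTASKS = {
    "book_flight": ["request(depart_date)", "request(dest_city)", "confirm()"],
    "reserve_hotel": ["request(checkin_date)", "request(hotel_city)", "confirm()"],
}

def top_level_policy(state):
    return random.choice(list(SUBTASKS))        # pick an option (subtask)

def low_level_policy(subtask, state):
    return random.choice(SUBTASKS[subtask])     # pick a primitive act

tracker = {"constraints": {}}                   # global state tracker
for _ in range(2):                              # outer loop over options
    option = top_level_policy(tracker)
    for _ in range(3):                          # inner loop until subtask ends
        act = low_level_policy(option, tracker)
        tracker["constraints"].setdefault(option, []).append(act)
print(tracker)
```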

2016

pdf bib
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li | Will Monroe | Alan Ritter | Dan Jurafsky | Michel Galley | Jianfeng Gao
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads
Ji He | Mari Ostendorf | Xiaodong He | Jianshu Chen | Jianfeng Gao | Lihong Li | Li Deng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Bi-directional Attention with Agreement for Dependency Parsing
Hao Cheng | Hao Fang | Xiaodong He | Jianfeng Gao | Li Deng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Diversity-Promoting Objective Function for Neural Conversation Models
Jiwei Li | Michel Galley | Chris Brockett | Jianfeng Gao | Bill Dolan
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
A Persona-Based Neural Conversation Model
Jiwei Li | Michel Galley | Chris Brockett | Georgios Spithourakis | Jianfeng Gao | Bill Dolan
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Deep Reinforcement Learning with a Natural Language Action Space
Ji He | Jianshu Chen | Xiaodong He | Jianfeng Gao | Lihong Li | Li Deng | Mari Ostendorf
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
A Neural Network Approach to Context-Sensitive Generation of Conversational Responses
Alessandro Sordoni | Michel Galley | Michael Auli | Chris Brockett | Yangfeng Ji | Margaret Mitchell | Jian-Yun Nie | Jianfeng Gao | Bill Dolan
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval
Xiaodong Liu | Jianfeng Gao | Xiaodong He | Li Deng | Kevin Duh | Ye-yi Wang
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Deep Learning and Continuous Representations for Natural Language Processing
Wen-tau Yih | Xiaodong He | Jianfeng Gao
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts

pdf bib
Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base
Wen-tau Yih | Ming-Wei Chang | Xiaodong He | Jianfeng Gao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets
Michel Galley | Chris Brockett | Alessandro Sordoni | Yangfeng Ji | Michael Auli | Chris Quirk | Margaret Mitchell | Jianfeng Gao | Bill Dolan
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf bib
Learning Continuous Phrase Representations for Translation Modeling
Jianfeng Gao | Xiaodong He | Wen-tau Yih | Li Deng
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Decoder Integration and Expected BLEU Training for Recurrent Neural Network Language Models
Michael Auli | Jianfeng Gao
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Minimum Translation Modeling with Recurrent Neural Networks
Yuening Hu | Michael Auli | Qin Gao | Jianfeng Gao
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Modeling Interestingness with Deep Neural Networks
Jianfeng Gao | Patrick Pantel | Michael Gamon | Xiaodong He | Li Deng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Large-scale Expected BLEU Training of Phrase-based Reordering Models
Michael Auli | Michel Galley | Jianfeng Gao
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf bib
Beyond Left-to-Right: Multiple Decomposition Structures for SMT
Hui Zhang | Kristina Toutanova | Chris Quirk | Jianfeng Gao
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Training MRF-Based Phrase Translation Models using Gradient Ascent
Jianfeng Gao | Xiaodong He
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

pdf bib
MSR SPLAT, a language analysis toolkit
Chris Quirk | Pallavi Choudhury | Jianfeng Gao | Hisami Suzuki | Kristina Toutanova | Michael Gamon | Wen-tau Yih | Colin Cherry | Lucy Vanderwende
Proceedings of the Demonstration Session at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
A Unified Approach to Transliteration-based Text Input with Online Spelling Correction
Hisami Suzuki | Jianfeng Gao
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
Learning Lexicon Models from Search Logs for Query Expansion
Jianfeng Gao | Shasha Xie | Xiaodong He | Alnur Ali
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

pdf bib
Domain Adaptation via Pseudo In-Domain Data Selection
Amittai Axelrod | Xiaodong He | Jianfeng Gao
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

pdf bib
A Large Scale Ranker-Based System for Search Query Spelling Correction
Jianfeng Gao | Xiaolong Li | Daniel Micol | Chris Quirk | Xu Sun
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf bib
A comparison of unsupervised methods for Part-of-Speech Tagging in Chinese
Alex Cheng | Fei Xia | Jianfeng Gao
Coling 2010: Posters

pdf bib
Learning Phrase-Based Spelling Error Models from Clickthrough Data
Xu Sun | Jianfeng Gao | Daniel Micol | Chris Quirk
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

pdf bib
Model Adaptation via Model Interpolation and Boosting for Web Search Ranking
Jianfeng Gao | Qiang Wu | Chris Burges | Krysta Svore | Yi Su | Nazan Khan | Shalin Shah | Hongyan Zhou
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Discovery of Term Variation in Japanese Web Search Queries
Hisami Suzuki | Xiao Li | Jianfeng Gao
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2008

pdf bib
Using Contextual Speller Techniques and Language Modeling for ESL Error Correction
Michael Gamon | Jianfeng Gao | Chris Brockett | Alexandre Klementiev | William B. Dolan | Dmitriy Belenko | Lucy Vanderwende
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib
A Web-based English Proofing System for English as a Second Language Users
Xing Yi | Jianfeng Gao | William B. Dolan
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

pdf bib
Bayesian Semi-Supervised Chinese Word Segmentation for Statistical Machine Translation
Jia Xu | Jianfeng Gao | Kristina Toutanova | Hermann Ney
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Indirect-HMM-based Hypothesis Alignment for Combining Outputs from Machine Translation Systems
Xiaodong He | Mei Yang | Jianfeng Gao | Patrick Nguyen | Robert Moore
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
A comparison of Bayesian estimators for unsupervised Hidden Markov Model POS taggers
Jianfeng Gao | Mark Johnson
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

2007

pdf bib
A Comparative Study of Parameter Estimation Methods for Statistical Natural Language Processing
Jianfeng Gao | Galen Andrew | Mark Johnson | Kristina Toutanova
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
Compressing Trigram Language Models With Golomb Coding
Kenneth Church | Ted Hart | Jianfeng Gao
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf bib
Approximation Lasso Methods for Language Modeling
Jianfeng Gao | Hisami Suzuki | Bin Yu
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
A DOM Tree Alignment Model for Mining Parallel Data from the Web
Lei Shi | Cheng Niu | Ming Zhou | Jianfeng Gao
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
An Information-Theoretic Approach to Automatic Evaluation of Summaries
Chin-Yew Lin | Guihong Cao | Jianfeng Gao | Jian-Yun Nie
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

2005

pdf bib
Chinese Word Segmentation and Named Entity Recognition: A Pragmatic Approach
Jianfeng Gao | Mu Li | Andi Wu | Chang-Ning Huang
Computational Linguistics, Volume 31, Number 4, December 2005

pdf bib
An Empirical Study on Language Model Adaptation Using a Metric of Domain Similarity
Wei Yuan | Jianfeng Gao | Hisami Suzuki
Second International Joint Conference on Natural Language Processing: Full Papers

pdf bib
Transformation Based Chinese Entity Detection and Tracking
Yaqian Zhou | Changning Huang | Jianfeng Gao | Lide Wu
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

pdf bib
Minimum Sample Risk Methods for Language Modeling
Jianfeng Gao | Hao Yu | Wei Yuan | Peng Xu
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

pdf bib
A Comparative Study on Language Model Adaptation Techniques Using New Evaluation Metrics
Hisami Suzuki | Jianfeng Gao
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

pdf bib
Adaptive Chinese Word Segmentation
Jianfeng Gao | Andi Wu | Mu Li | Chang-Ning Huang | Hongqiao Li | Xinsong Xia | Haowei Qin
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf bib
Chinese Chunking with Another Type of Spec
Hongqiao Li | Changning Huang | Jianfeng Gao | Xiaozhong Fan
Proceedings of the Third SIGHAN Workshop on Chinese Language Processing

pdf bib
A Semi-Supervised Approach to Build Annotated Corpus for Chinese Named Entity Recognition
Xiaoshan Fang | Jianfeng Gao | Huanye Sheng
Proceedings of the Third SIGHAN Workshop on Chinese Language Processing

2003

pdf bib
Unsupervised Training for Overlapping Ambiguity Resolution in Chinese Word Segmentation
Mu Li | Jianfeng Gao | Chang-Ning Huang | Jianfeng Li
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing

pdf bib
Single Character Chinese Named Entity Recognition
Xiaodan Zhu | Mu Li | Jianfeng Gao | Chang-Ning Huang
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing

pdf bib
Improved Source-Channel Models for Chinese Word Segmentation
Jianfeng Gao | Mu Li | Chang-Ning Huang
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

pdf bib
Unsupervised Learning of Dependency Structure for Language Modeling
Jianfeng Gao | Hisami Suzuki
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

pdf bib
A Class-based Language Model Approach to Chinese Named Entity Identification
Jian Sun | Ming Zhou | Jianfeng Gao
International Journal of Computational Linguistics & Chinese Language Processing, Volume 8, Number 2, August 2003

2002

pdf bib
Exploiting Headword Dependency and Predictive Clustering for Language Modeling
Jianfeng Gao | Hisami Suzuki | Yang Wen
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

pdf bib
Finding the Better Indexing units for Chinese Information Retrieval
Hongzhao He | Jianfeng Gao | Pilian He | Changning Huang
COLING-02: The First SIGHAN Workshop on Chinese Language Processing

pdf bib
Improving Language Model Size Reduction using Better Pruning Criteria
Jianfeng Gao | Min Zhang
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

pdf bib
Exploring Asymmetric Clustering for Statistical Language Modeling
Jianfeng Gao | Joshua Goodman | Guihong Cao | Hang Li
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

pdf bib
Chinese Named Entity Identification Using Class-based Language Model
Jian Sun | Jianfeng Gao | Lei Zhang | Ming Zhou | Changning Huang
COLING 2002: The 19th International Conference on Computational Linguistics

2001

pdf bib
The Use of Clustering Techniques for Language Modeling – Application to Asian Language
Jianfeng Gao | Joshua T. Goodman | Jiangbo Miao
International Journal of Computational Linguistics & Chinese Language Processing, Volume 6, Number 1, February 2001: Special Issue on Natural Language Processing Researches in MSRA

pdf bib
Improving the Effectiveness of Information Retrieval with Clustering and Fusion
Jian Zhang | Jianfeng Gao | Ming Zhou | Jiaxing Wang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 6, Number 1, February 2001: Special Issue on Natural Language Processing Researches in MSRA

2000

pdf bib
PENS: A Machine-aided English Writing System for Chinese Users
Ting Liu | Ming Zhou | Jianfeng Gao | Endong Xun | Changning Huang
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

pdf bib
Distribution-Based Pruning of Backoff Language Models
Jianfeng Gao | Kai-Fu Lee
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

pdf bib
Extraction of Chinese Compound Words - An Experimental Study on a Very Large Corpus
Jian Zhang | Jianfeng Gao | Ming Zhou
Second Chinese Language Processing Workshop
