Hao Cheng


2020

pdf bib
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering
Hao Cheng | Ming-Wei Chang | Kenton Lee | Kristina Toutanova
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We address the problem of extractive question answering using document-level distant super-vision, pairing questions and relevant documents with answer strings. We compare previously used probability space and distant supervision assumptions (assumptions on the correspondence between the weak answer string labels and possible answer mention spans). We show that these assumptions interact, and that different configurations provide complementary benefits. We demonstrate that a multi-objective model can efficiently combine the advantages of multiple assumptions and outperform the best individual formulation. Our approach outperforms previous state-of-the-art models by 4.3 points in F1 on TriviaQA-Wiki and 1.7 points in Rouge-L on NarrativeQA summaries.

pdf bib
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu | Yu Wang | Jianshu Ji | Hao Cheng | Xueyun Zhu | Emmanuel Awa | Pengcheng He | Weizhu Chen | Hoifung Poon | Guihong Cao | Jianfeng Gao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.

pdf bib
Train Once, and Decode As You Like
Chao Tian | Yifei Wang | Hao Cheng | Yijiang Lian | Zhihua Zhang
Proceedings of the 28th International Conference on Computational Linguistics

In this paper we propose a unified approach for supporting different generation manners of machine translation, including autoregressive, semi-autoregressive, and refinement-based non-autoregressive models. Our approach works by repeatedly selecting positions and generating tokens at these selected positions. After being trained once, our approach achieves better or competitive translation performance compared with some strong task-specific baseline models in all the settings. This generalization ability benefits mainly from the new training objective that we propose. We validate our approach on the WMT’14 English-German and IWSLT’14 German-English translation tasks. The experimental results are encouraging.

2019

pdf bib
A Dynamic Speaker Model for Conversational Interactions
Hao Cheng | Hao Fang | Mari Ostendorf
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Individual differences in speakers are reflected in their language use as well as in their interests and opinions. Characterizing these differences can be useful in human-computer interaction, as well as analysis of human-human conversations. In this work, we introduce a neural model for learning a dynamically updated speaker embedding in a conversational context. Initial model training is unsupervised, using context-sensitive language generation as an objective, with the context being the conversation history. Further fine-tuning can leverage task-dependent supervised training. The learned neural representation of speakers is shown to be useful for content ranking in a socialbot and dialog act prediction in human-human conversations.

2018

pdf bib
Sounding Board: A User-Centric and Content-Driven Social Chatbot
Hao Fang | Hao Cheng | Maarten Sap | Elizabeth Clark | Ari Holtzman | Yejin Choi | Noah A. Smith | Mari Ostendorf
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize. The system architecture consists of several components including spoken language processing, dialogue management, language generation, and content management, with emphasis on user-centric and content-driven design. We also share insights gained from large-scale online logs based on 160,000 conversations with real-world users.

2017

pdf bib
A Factored Neural Network Model for Characterizing Online Discussions in Vector Space
Hao Cheng | Hao Fang | Mari Ostendorf
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums. The model links different context with related language factors in the embedding space, providing a way to interpret the factored embeddings. Evaluated on a community endorsement prediction task using a large collection of topic-varying Reddit discussions, the factored embeddings consistently achieve improvement over other text representations. Qualitative analysis shows that the model captures community style and topic, as well as response trigger patterns.

2016

pdf bib
Bi-directional Attention with Agreement for Dependency Parsing
Hao Cheng | Hao Fang | Xiaodong He | Jianfeng Gao | Li Deng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Learning Latent Local Conversation Modes for Predicting Comment Endorsement in Online Discussions
Hao Fang | Hao Cheng | Mari Ostendorf
Proceedings of The Fourth International Workshop on Natural Language Processing for Social Media

2015

pdf bib
Open-Domain Name Error Detection using a Multi-Task RNN
Hao Cheng | Hao Fang | Mari Ostendorf
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Language Models for Image Captioning: The Quirks and What Works
Jacob Devlin | Hao Cheng | Hao Fang | Saurabh Gupta | Li Deng | Xiaodong He | Geoffrey Zweig | Margaret Mitchell
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)