Cheng Niu


2020

A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking
Yong Shan | Zekang Li | Jinchao Zhang | Fandong Meng | Yang Feng | Cheng Niu | Jie Zhou
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recent studies in dialogue state tracking (DST) leverage historical information to determine states, which are generally represented as slot-value pairs. However, most of them are limited in how efficiently they exploit relevant context because they lack a powerful mechanism for modeling interactions between the slot and the dialogue history. Besides, existing methods usually ignore the slot imbalance problem and treat all slots indiscriminately, which limits the learning of hard slots and eventually hurts overall performance. In this paper, we propose to enhance DST by employing a contextual hierarchical attention network that not only discerns relevant information at both the word level and the turn level but also learns contextual representations. We further propose an adaptive objective to alleviate the slot imbalance problem by dynamically adjusting the weights of different slots during training. Experimental results show that our approach reaches 52.68% and 58.55% joint accuracy on the MultiWOZ 2.0 and MultiWOZ 2.1 datasets respectively and achieves new state-of-the-art performance with considerable improvements (+1.24% and +5.98%).

Diversifying Dialogue Generation with Non-Conversational Text
Hui Su | Xiaoyu Shen | Sanqiang Zhao | Zhou Xiao | Pengwei Hu | Randy Zhong | Cheng Niu | Jie Zhou
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Neural network-based sequence-to-sequence (seq2seq) models suffer strongly from the low-diversity problem when it comes to open-domain dialogue generation. As bland and generic utterances usually dominate the frequency distribution in our daily chitchat, avoiding them to generate more interesting responses requires complex data filtering, sampling techniques or modifying the training objective. In this paper, we propose a new perspective to diversify dialogue generation by leveraging non-conversational text. Compared with bilateral conversations, non-conversational text is easier to obtain, more diverse and covers a much broader range of topics. We collect a large-scale non-conversational corpus from multiple sources including forum comments, idioms and book snippets. We further present a training paradigm to effectively incorporate this text via iterative back translation. The resulting model is tested on two conversational datasets from different domains and is shown to produce significantly more diverse responses without sacrificing relevance to the context.

Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence
Xiaoyu Shen | Ernie Chang | Hui Su | Cheng Niu | Dietrich Klakow
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The neural attention model has achieved great success in data-to-text generation tasks. Though it usually excels at producing fluent text, it suffers from the problems of missing information, repetition and “hallucination”. Due to the black-box nature of the neural attention architecture, avoiding these problems in a systematic way is non-trivial. To address this concern, we propose to explicitly segment target text into fragment units and align them with their data correspondences. The segmentation and correspondence are jointly learned as latent variables without any human annotations. We further impose a soft statistical constraint to regularize the segmental granularity. The resulting architecture maintains the same expressive power as neural attention models, while being able to generate fully interpretable outputs with several times less computational cost. On both E2E and WebNLG benchmarks, we show the proposed model consistently outperforms its neural attention counterparts.

Contrastive Zero-Shot Learning for Cross-Domain Slot Filling with Adversarial Attack
Keqing He | Jinchao Zhang | Yuanmeng Yan | Weiran Xu | Cheng Niu | Jie Zhou
Proceedings of the 28th International Conference on Computational Linguistics

Zero-shot slot filling has emerged as a way to cope with data scarcity in target domains. However, previous approaches often ignore constraints between the slot value representation and the related slot description representation in the latent space, and they lack sufficient model robustness. In this paper, we propose a Contrastive Zero-Shot Learning with Adversarial Attack (CZSL-Adv) method for cross-domain slot filling. The contrastive loss aims to map slot value contextual representations to the corresponding slot description representations. We also introduce an adversarial attack training strategy to improve model robustness. Experimental results show that our model significantly outperforms state-of-the-art baselines under both zero-shot and few-shot settings.

MovieChats: Chat like Humans in a Closed Domain
Hui Su | Xiaoyu Shen | Zhou Xiao | Zheng Zhang | Ernie Chang | Cheng Zhang | Cheng Niu | Jie Zhou
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Being able to perform in-depth chat with humans in a closed domain is a precondition before an open-domain chatbot can ever be claimed. In this work, we take a close look at the movie domain and present a large-scale high-quality corpus with fine-grained annotations in the hope of pushing the limit of movie-domain chatbots. We propose a unified, readily scalable neural approach which reconciles all subtasks like intent prediction and knowledge retrieval. The model is first pretrained on huge general-domain data, then finetuned on our corpus. We show this simple neural approach trained on high-quality data is able to outperform commercial systems relying on complex rules. On both the static and interactive tests, we find responses generated by our system exhibit remarkably good engagement and sensibleness close to human-written ones. We further analyze the limits of our work and point out potential directions for future work.

2019

Answer-Supervised Question Reformulation for Enhancing Conversational Machine Comprehension
Qian Li | Hui Su | Cheng Niu | Daling Wang | Zekang Li | Shi Feng | Yifei Zhang
Proceedings of the 2nd Workshop on Machine Reading for Question Answering

In conversational machine comprehension, integrating conversational history information through question reformulation to obtain better answers has become a research hotspot. However, existing question reformulation models are trained using only supervised question labels produced by annotators, without considering any feedback information from answers. In this paper, we propose a novel Answer-Supervised Question Reformulation (ASQR) model for enhancing conversational machine comprehension with reinforcement learning technology. ASQR utilizes a pointer-copy-based question reformulation model as an agent, takes an action to predict the next word, and observes a reward for the whole sentence state after generating the end-of-sequence token. The experimental results on the QuAC dataset show that our ASQR model is more effective in conversational machine comprehension. Moreover, pretraining is essential in reinforcement learning models, so we provide a high-quality annotated dataset for question reformulation by sampling part of the QuAC dataset.

Incremental Transformer with Deliberation Decoder for Document Grounded Conversations
Zekang Li | Cheng Niu | Fandong Meng | Yang Feng | Qian Li | Jie Zhou
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Document Grounded Conversations is a task to generate dialogue responses when chatting about the content of a given document. Document knowledge plays a critical role in Document Grounded Conversations, yet existing dialogue models do not exploit this kind of knowledge effectively enough. In this paper, we propose a novel Transformer-based architecture for multi-turn document grounded conversations. In particular, we devise an Incremental Transformer to encode multi-turn utterances along with knowledge in related documents. Motivated by the human cognitive process, we design a two-pass decoder (Deliberation Decoder) to improve context coherence and knowledge correctness. Our empirical study on a real-world Document Grounded Dataset shows that responses generated by our model significantly outperform competitive baselines on both context coherence and knowledge relevance.

Improving Multi-turn Dialogue Modelling with Utterance ReWriter
Hui Su | Xiaoyu Shen | Rongzhi Zhang | Fei Sun | Pengwei Hu | Cheng Niu | Jie Zhou
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Recent research has achieved impressive results in single-turn dialogue modelling. In the multi-turn setting, however, current models are still far from satisfactory. One major challenge is the frequent coreference and information omission in our daily conversation, which make it hard for machines to understand the real intention. In this paper, we propose rewriting the human utterance as a preprocessing step to help multi-turn dialogue modelling. Each utterance is first rewritten to recover all coreferred and omitted information. The next processing steps are then performed based on the rewritten utterance. To properly train the utterance rewriter, we collect a new dataset with human annotations and introduce a Transformer-based utterance rewriting architecture using the pointer network. We show the proposed architecture achieves remarkably good performance on the utterance rewriting task. The trained utterance rewriter can be easily integrated into online chatbots and brings general improvement over different domains.

Rhetorically Controlled Encoder-Decoder for Modern Chinese Poetry Generation
Zhiqiang Liu | Zuohui Fu | Jie Cao | Gerard de Melo | Yik-Cheung Tam | Cheng Niu | Jie Zhou
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Rhetoric is a vital element in modern poetry, and plays an essential role in improving its aesthetics. However, to date, it has not been considered in research on automatic poetry generation. In this paper, we propose a rhetorically controlled encoder-decoder for modern Chinese poetry generation. Our model relies on a continuous latent variable as a rhetoric controller to capture various rhetorical patterns in an encoder, and then incorporates rhetoric-based mixtures while generating modern Chinese poetry. For metaphor and personification, an automated evaluation shows that our model outperforms state-of-the-art baselines by a substantial margin, while human evaluation shows that our model generates better poems than baseline methods in terms of fluency, coherence, meaningfulness, and rhetorical aesthetics.

2008

Combining Multiple Resources to Improve SMT-based Paraphrasing Model
Shiqi Zhao | Cheng Niu | Ming Zhou | Ting Liu | Sheng Li
Proceedings of ACL-08: HLT

2006

A DOM Tree Alignment Model for Mining Parallel Data from the Web
Lei Shi | Cheng Niu | Ming Zhou | Jianfeng Gao
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

Word Independent Context Pair Classification Model for Word Sense Disambiguation
Cheng Niu | Wei Li | Rohini K. Srihari | Huifeng Li
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)

2004

Weakly Supervised Learning for Cross-document Person Name Disambiguation Supported by Information Extraction
Cheng Niu | Wei Li | Rohini K. Srihari
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

Context clustering for Word Sense Disambiguation based on modeling pairwise context similarities
Cheng Niu | Wei Li | Rohini K. Srihari | Huifeng Li | Laurie Crist
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

2003

InfoXtract location normalization: a hybrid approach to geographic references in information extraction
Huifeng Li | Rohini K. Srihari | Cheng Niu | Wei Li
Proceedings of the HLT-NAACL 2003 Workshop on Analysis of Geographic References

InfoXtract: A Customizable Intermediate Level Information Extraction Engine
Rohini K. Srihari | Wei Li | Cheng Niu | Thomas Cornell
Proceedings of the HLT-NAACL 2003 Workshop on Software Engineering and Architecture of Language Technology Systems (SEALTS)

Question Answering on a Case Insensitive Corpus
Wei Li | Rohini Srihari | Cheng Niu | Xiaoge Li
Proceedings of the ACL 2003 Workshop on Multilingual Summarization and Question Answering

A Bootstrapping Approach to Named Entity Classification Using Successive Learners
Cheng Niu | Wei Li | Jihong Ding | Rohini Srihari
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

An Expert Lexicon Approach to Identifying English Phrasal Verbs
Wei Li | Xiuhong Zhang | Cheng Niu | Yuankai Jiang | Rohini K. Srihari
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

Bootstrapping for Named Entity Tagging Using Concept-based Seeds
Cheng Niu | Wei Li | Jihong Ding | Rohini K. Srihari
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers

2002

Extracting Exact Answers to Questions Based on Structural Links
Wei Li | Rohini K. Srihari | Xiaoge Li | M. Srikanth | Xiuhong Zhang | Cheng Niu
COLING-02: Multilingual Summarization and Question Answering

Location Normalization for Information Extraction
Huifeng Li | Rohini K. Srihari | Cheng Niu | Wei Li
COLING 2002: The 19th International Conference on Computational Linguistics