Dong Wang


2020

Conversational Word Embedding for Retrieval-Based Dialog System
Wentao Ma | Yiming Cui | Ting Liu | Dong Wang | Shijin Wang | Guoping Hu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Human conversations contain many types of information, e.g., knowledge, common sense, and language habits. In this paper, we propose a conversational word embedding method named PR-Embedding, which utilizes the conversation pairs <post, reply> to learn word embedding. Different from previous works, PR-Embedding uses the vectors from two different semantic spaces to represent the words in post and reply. To catch the information among the pair, we first introduce the word alignment model from statistical machine translation to generate the cross-sentence window, then train the embedding on word-level and sentence-level. We evaluate the method on single-turn and multi-turn response selection tasks for retrieval-based dialog systems. The experiment results show that PR-Embedding can improve the quality of the selected response.

Summarize before Aggregate: A Global-to-local Heterogeneous Graph Inference Network for Conversational Emotion Recognition
Dongming Sheng | Dong Wang | Ying Shen | Haitao Zheng | Haozhuang Liu
Proceedings of the 28th International Conference on Computational Linguistics

Conversational Emotion Recognition (CER) is a crucial task in Natural Language Processing (NLP) with wide applications. Prior works in CER generally focus on modeling emotion influences solely with utterance-level features, with little attention paid to phrase-level semantic connections between utterances. Phrases carry sentiments when they refer to emotional events under certain topics, providing a global semantic connection between utterances throughout the entire conversation. In this work, we propose a two-stage Summarization and Aggregation Graph Inference Network (SumAggGIN), which seamlessly integrates inference for topic-related emotional phrases and local dependency reasoning over neighbouring utterances in a global-to-local fashion. Topic-related emotional phrases, which constitute the global topic-related emotional connections, are recognized by our proposed heterogeneous Summarization Graph. Local dependencies, which capture short-term emotional effects between neighbouring utterances, are further injected via an Aggregation Graph to distinguish the subtle differences between utterances containing emotional phrases. The two steps of graph inference are tightly coupled for a comprehensive understanding of emotional fluctuation. Experimental results on three CER benchmark datasets verify the effectiveness of our proposed model, which outperforms the state-of-the-art approaches.

Integrating User History into Heterogeneous Graph for Dialogue Act Recognition
Dong Wang | Ziran Li | Haitao Zheng | Ying Shen
Proceedings of the 28th International Conference on Computational Linguistics

Dialogue Act Recognition (DAR) is a challenging problem in Natural Language Understanding, which aims to attach Dialogue Act (DA) labels to each utterance in a conversation. However, previous studies cannot fully recognize the specific expressions given by users due to the informality and diversity of natural language expressions. To solve this problem, we propose a Heterogeneous User History (HUH) graph convolution network, which utilizes the user’s historical answers grouped by DA labels as additional clues to recognize the DA label of utterances. To handle the noise caused by introducing the user’s historical answers, we design a set of denoising mechanisms, including a History Selection process, a Similarity Re-weighting process, and an Edge Re-weighting process. We evaluate the proposed method on two benchmark datasets, MSDialog and MRDA. The experimental results verify the effectiveness of integrating the user’s historical answers, and show that our proposed model outperforms the state-of-the-art methods.

2017

Discourse Mode Identification in Essays
Wei Song | Dong Wang | Ruiji Fu | Lizhen Liu | Ting Liu | Guoping Hu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Discourse modes play an important role in writing composition and evaluation. This paper presents a study on the manual and automatic identification of narration, exposition, description, argument and emotion-expressing sentences in narrative essays. We annotate a corpus to study the characteristics of discourse modes and describe a neural sequence labeling model for identification. Evaluation results show that discourse modes can be identified automatically with an average F1-score of 0.7. We further demonstrate that discourse modes can be used as features that improve automatic essay scoring (AES). The impacts of discourse modes on AES are also discussed.

Flexible and Creative Chinese Poetry Generation Using Neural Memory
Jiyuan Zhang | Yang Feng | Dong Wang | Yang Wang | Andrew Abel | Shiyue Zhang | Andi Zhang
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

It has been shown that Chinese poems can be successfully generated by sequence-to-sequence neural models, particularly with the attention mechanism. A potential problem of this approach, however, is that neural models can only learn abstract rules, while poem generation is a highly creative process that involves not only rules but also innovations for which pure statistical models are not appropriate in principle. This work proposes a memory-augmented neural model for Chinese poem generation, where the neural model and the augmented memory work together to balance the requirements of linguistic accordance and aesthetic innovation, leading to innovative generations that are still rule-compliant. In addition, it is found that the memory mechanism provides interesting flexibility that can be used to generate poems with different styles.

Memory-augmented Neural Machine Translation
Yang Feng | Shiyue Zhang | Andi Zhang | Dong Wang | Andrew Abel
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Neural machine translation (NMT) has achieved notable success in recent times; however, it is also widely recognized that this approach has limitations in handling infrequent words and word pairs. This paper presents a novel memory-augmented NMT (M-NMT) architecture, which stores knowledge about how words (usually infrequently encountered ones) should be translated in a memory and then utilizes this knowledge to assist the neural model. We use this memory mechanism to combine the knowledge learned from a conventional statistical machine translation system and the rules learned by an NMT system, and also propose a solution for out-of-vocabulary (OOV) words based on this framework. Our experiments on two Chinese-English translation tasks demonstrated that the M-NMT architecture outperformed the NMT baseline by 9.0 and 2.7 BLEU points on the two tasks, respectively. Additionally, we found this architecture resulted in a much more effective OOV treatment compared to competitive methods.

2015

Stochastic Top-k ListNet
Tianyi Luo | Dong Wang | Rong Liu | Yiqiao Pan
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Joint Semantic Relevance Learning with Text Data and Graph Knowledge
Dongxu Zhang | Bin Yuan | Dong Wang | Rong Liu
Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality

Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation
Chao Xing | Dong Wang | Chao Liu | Yiye Lin
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

A Two-step Approach to Sentence Compression of Spoken Utterances
Dong Wang | Xian Qian | Yang Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Effort of Genre Variation and Prediction of System Performance
Dong Wang | Fei Xia
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Domain adaptation is an important task in order for NLP systems to work well in real applications. There has been extensive research on this topic. In this paper, we address two issues that are related to domain adaptation. The first question is how much genre variation will affect NLP systems' performance. We investigate the effect of genre variation on the performance of three NLP tools, namely, a word segmenter, a POS tagger, and a parser. We choose the Chinese Penn Treebank (CTB) as our corpus. The second question is how one can estimate NLP systems' performance when a gold standard for the test data does not exist. To answer the question, we extend the prediction model of Ravi et al. (2008) to provide predictions for word segmentation and POS tagging as well. Our experiments show that the predicted scores are close to the real scores when tested on the CTB data.

Tweet Ranking Based on Heterogeneous Networks
Hongzhao Huang | Arkaitz Zubiaga | Heng Ji | Hongbo Deng | Dong Wang | Hieu Le | Tarek Abdelzaher | Jiawei Han | Alice Leung | John Hancock | Clare Voss
Proceedings of COLING 2012

2011

A Cross-corpus Study of Unsupervised Subjectivity Identification based on Calibrated EM
Dong Wang | Yang Liu
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)

A Pilot Study of Opinion Summarization in Conversations
Dong Wang | Yang Liu
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Improving Blog Polarity Classification via Topic Analysis and Adaptive Methods
Feifan Liu | Dong Wang | Bin Li | Yang Liu
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics