Liang Pang


2020

pdf bib
Beyond Language: Learning Commonsense from Images for Reasoning
Wanqing Cui | Yanyan Lan | Liang Pang | Jiafeng Guo | Xueqi Cheng
Findings of the Association for Computational Linguistics: EMNLP 2020

This paper proposes a novel approach to learn commonsense from images, instead of limited raw texts or costly constructed knowledge bases, for the commonsense reasoning problem in NLP. Our motivation comes from the fact that an image is worth a thousand words, where richer scene information could be leveraged to help distill the commonsense knowledge, which is often hidden in languages. Our approach, namely Loire, consists of two stages. In the first stage, a bi-modal sequence-to-sequence approach is utilized to conduct the scene layout generation task, based on a text representation model ViBERT. In this way, the required visual scene knowledge, such as spatial relations, will be encoded in ViBERT by the supervised learning process with some bi-modal data like COCO. Then ViBERT is concatenated with a pre-trained language model to perform the downstream commonsense reasoning tasks. Experimental results on two commonsense reasoning problems, i.e.commonsense question answering and pronoun resolution, demonstrate that Loire outperforms traditional language-based methods. We also give some case studies to show what knowledge is learned from images and explain how the generated scene layout helps the commonsense reasoning process.

pdf bib
METNet: A Mutual Enhanced Transformation Network for Aspect-based Sentiment Analysis
Bin Jiang | Jing Hou | Wanyue Zhou | Chao Yang | Shihan Wang | Liang Pang
Proceedings of the 28th International Conference on Computational Linguistics

Aspect-based sentiment analysis (ABSA) aims to determine the sentiment polarity of each specific aspect in a given sentence. Existing researches have realized the importance of the aspect for the ABSA task and have derived many interactive learning methods that model context based on specific aspect. However, current interaction mechanisms are ill-equipped to learn complex sentences with multiple aspects, and these methods underestimate the representation learning of the aspect. In order to solve the two problems, we propose a mutual enhanced transformation network (METNet) for the ABSA task. First, the aspect enhancement module in METNet improves the representation learning of the aspect with contextual semantic features, which gives the aspect more abundant information. Second, METNet designs and implements a hierarchical structure, which enhances the representations of aspect and context iteratively. Experimental results on SemEval 2014 Datasets demonstrate the effectiveness of METNet, and we further prove that METNet is outstanding in multi-aspect scenarios.

pdf bib
PEDNet: A Persona Enhanced Dual Alternating Learning Network for Conversational Response Generation
Bin Jiang | Wanyue Zhou | Jingxu Yang | Chao Yang | Shihan Wang | Liang Pang
Proceedings of the 28th International Conference on Computational Linguistics

Endowing a chatbot with a personality is essential to deliver more realistic conversations. Various persona-based dialogue models have been proposed to generate personalized and diverse responses by utilizing predefined persona information. However, generating personalized responses is still a challenging task since the leverage of predefined persona information is often insufficient. To alleviate this problem, we propose a novel Persona Enhanced Dual Alternating Learning Network (PEDNet) aiming at producing more personalized responses in various open-domain conversation scenarios. PEDNet consists of a Context-Dominate Network (CDNet) and a Persona-Dominate Network (PDNet), which are built upon a common encoder-decoder backbone. CDNet learns to select a proper persona as well as ensure the contextual relevance of the predicted response, while PDNet learns to enhance the utilization of persona information when generating the response by weakening the disturbance of specific content in the conversation context. CDNet and PDNet are trained alternately using a multi-task training approach to equip PEDNet with the both capabilities they have learned. Both automatic and human evaluations on a newly released dialogue dataset Persona-chat demonstrate that our method could deliver more personalized responses than baseline methods.

pdf bib
Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains
Weijie Yu | Chen Xu | Jun Xu | Liang Pang | Xiaopeng Gao | Xiaozhao Wang | Ji-Rong Wen
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

One approach to matching texts from asymmetrical domains is projecting the input sequences into a common semantic space as feature vectors upon which the matching function can be readily defined and learned. In real-world matching practices, it is often observed that with the training goes on, the feature vectors projected from different domains tend to be indistinguishable. The phenomenon, however, is often overlooked in existing matching models. As a result, the feature vectors are constructed without any regularization, which inevitably increases the difficulty of learning the downstream matching functions. In this paper, we propose a novel match method tailored for text matching in asymmetrical domains, called WD-Match. In WD-Match, a Wasserstein distance-based regularizer is defined to regularize the features vectors projected from different domains. As a result, the method enforces the feature projection function to generate vectors such that those correspond to different domains cannot be easily discriminated. The training process of WD-Match amounts to a game that minimizes the matching loss regularized by the Wasserstein distance. WD-Match can be used to improve different text matching methods, by using the method as its underlying matching model. Four popular text matching methods have been exploited in the paper. Experimental results based on four publicly available benchmarks showed that WD-Match consistently outperformed the underlying methods and the baselines.

2019

pdf bib
ReCoSa: Detecting the Relevant Contexts with Self-Attention for Multi-turn Dialogue Generation
Hainan Zhang | Yanyan Lan | Liang Pang | Jiafeng Guo | Xueqi Cheng
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In multi-turn dialogue generation, response is usually related with only a few contexts. Therefore, an ideal model should be able to detect these relevant contexts and produce a suitable response accordingly. However, the widely used hierarchical recurrent encoder-decoder models just treat all the contexts indiscriminately, which may hurt the following response generation process. Some researchers try to use the cosine similarity or the traditional attention mechanism to find the relevant contexts, but they suffer from either insufficient relevance assumption or position bias problem. In this paper, we propose a new model, named ReCoSa, to tackle this problem. Firstly, a word level LSTM encoder is conducted to obtain the initial representation of each context. Then, the self-attention mechanism is utilized to update both the context and masked response representation. Finally, the attention weights between each context and response representations are computed and used in the further decoding process. Experimental results on both Chinese customer services dataset and English Ubuntu dialogue dataset show that ReCoSa significantly outperforms baseline models, in terms of both metric-based and human evaluations. Further analysis on attention shows that the detected relevant contexts by ReCoSa are highly coherent with human’s understanding, validating the correctness and interpretability of ReCoSa.