Jun Xu


Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation
Jun Xu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che | Ting Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and learn a graph grounded dialog policy, aimed at fostering a more coherent and controllable dialog. To this end, we first construct a conversational graph (CG) from dialog corpora, in which there are vertices to represent “what to say” and “how to say”, and edges to represent natural transition between a message (the last utterance in a dialog context) and its response. We then present a novel CG grounded policy learning framework that conducts dialog flow planning by graph traversal, which learns to identify a what-vertex and a how-vertex from the CG at each turn to guide response generation. In this way, we effectively leverage the CG to facilitate policy learning as follows: (1) it enables more effective long-term reward design, (2) it provides high-quality candidate actions, and (3) it gives us more control over the policy. Results on two benchmark corpora demonstrate the effectiveness of this framework.

Transformer-GCRF: Recovering Chinese Dropped Pronouns with General Conditional Random Fields
Jingxuan Yang | Kerui Xu | Jun Xu | Si Li | Sheng Gao | Jun Guo | Ji-Rong Wen | Nianwen Xue
Findings of the Association for Computational Linguistics: EMNLP 2020

Pronouns are often dropped in Chinese conversations, and recovering the dropped pronouns is important for NLP applications such as Machine Translation. Existing approaches usually formulate this as a sequence labeling task of predicting whether there is a dropped pronoun before each token and its type. Each utterance is considered to be a sequence and labeled independently. Although these approaches have shown promise, labeling each utterance independently ignores the dependencies between pronouns in neighboring utterances. Modeling these dependencies is critical to improving the performance of dropped pronoun recovery. In this paper, we present a novel framework that combines the strength of the Transformer network with General Conditional Random Fields (GCRF) to model the dependencies between pronouns in neighboring utterances. Results on three Chinese conversation datasets show that the Transformer-GCRF model outperforms the state-of-the-art dropped pronoun recovery models. Exploratory analysis also demonstrates that the GCRF did help to capture the dependencies between pronouns in neighboring utterances, thus contributing to the performance improvements.

Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains
Weijie Yu | Chen Xu | Jun Xu | Liang Pang | Xiaopeng Gao | Xiaozhao Wang | Ji-Rong Wen
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

One approach to matching texts from asymmetrical domains is projecting the input sequences into a common semantic space as feature vectors upon which the matching function can be readily defined and learned. In real-world matching practices, it is often observed that as training goes on, the feature vectors projected from different domains tend to be indistinguishable. The phenomenon, however, is often overlooked in existing matching models. As a result, the feature vectors are constructed without any regularization, which inevitably increases the difficulty of learning the downstream matching functions. In this paper, we propose a novel matching method tailored for text matching in asymmetrical domains, called WD-Match. In WD-Match, a Wasserstein distance-based regularizer is defined to regularize the feature vectors projected from different domains. As a result, the method forces the feature projection function to generate vectors such that those corresponding to different domains cannot be easily discriminated. The training process of WD-Match amounts to a game that minimizes the matching loss regularized by the Wasserstein distance. WD-Match can be used to improve different text matching methods, by using each method as its underlying matching model. Four popular text matching methods have been exploited in the paper. Experimental results based on four publicly available benchmarks showed that WD-Match consistently outperformed the underlying methods and the baselines.


Learning to Control the Specificity in Neural Response Generation
Ruqing Zhang | Jiafeng Guo | Yixing Fan | Yanyan Lan | Jun Xu | Xueqi Cheng
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In conversation, a general response (e.g., “I don’t know”) could correspond to a large variety of input utterances. Previous generative conversational models usually employ a single model to learn the relationship between different utterance-response pairs, and thus tend to favor general and trivial responses which appear frequently. To address this problem, we propose a novel controlled response generation mechanism to handle different utterance-response relationships in terms of specificity. Specifically, we introduce an explicit specificity control variable into a sequence-to-sequence model, which interacts with the usage representation of words through a Gaussian Kernel layer, to guide the model to generate responses at different specificity levels. We describe two ways to acquire distant labels for the specificity control variable in learning. Empirical studies show that our model can significantly outperform the state-of-the-art response generation models under both automatic and human evaluations.

Tailored Sequence to Sequence Models to Different Conversation Scenarios
Hainan Zhang | Yanyan Lan | Jiafeng Guo | Jun Xu | Xueqi Cheng
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Sequence to sequence (Seq2Seq) models have been widely used for response generation in the area of conversation. However, the requirements for different conversation scenarios are distinct. For example, customer service requires the generated responses to be specific and accurate, while a chatbot prefers diverse responses so as to attract different users. The current Seq2Seq model fails to meet these diverse requirements, by using a general average likelihood as the optimization criterion. As a result, it usually generates safe and commonplace responses, such as ‘I don’t know’. In this paper, we propose two tailored optimization criteria for Seq2Seq for different conversation scenarios, i.e., the maximum generated likelihood for the specific-requirement scenario, and the conditional value-at-risk for the diverse-requirement scenario. Experimental results on the Ubuntu dialogue corpus (Ubuntu service scenario) and Chinese Weibo dataset (social chatbot scenario) show that our proposed models not only satisfy diverse requirements for different scenarios, but also yield better performance than traditional Seq2Seq models in terms of both metric-based and human evaluations.


A Unified Architecture for Semantic Role Labeling and Relation Classification
Jiang Guo | Wanxiang Che | Haifeng Wang | Ting Liu | Jun Xu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper describes a unified neural architecture for identifying and classifying multi-typed semantic relations between words in a sentence. We investigate two typical and well-studied tasks: semantic role labeling (SRL) which identifies the relations between predicates and arguments, and relation classification (RC) which focuses on the relation between two entities or nominals. While mostly studied separately in prior work, we show that the two tasks can be effectively connected and modeled using a general architecture. Experiments on CoNLL-2009 benchmark datasets show that our SRL models significantly outperform state-of-the-art approaches. Our RC models also yield competitive performance with the best published records. Furthermore, we show that the two tasks can be trained jointly with multi-task learning, resulting in additive significant improvements for SRL.

UTHealth at SemEval-2016 Task 12: an End-to-End System for Temporal Information Extraction from Clinical Notes
Hee-Jin Lee | Hua Xu | Jingqi Wang | Yaoyun Zhang | Sungrim Moon | Jun Xu | Yonghui Wu
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)


Clinical Abbreviation Disambiguation Using Neural Word Embeddings
Yonghui Wu | Jun Xu | Yaoyun Zhang | Hua Xu
Proceedings of BioNLP 15

Learning Word Representations by Jointly Modeling Syntagmatic and Paradigmatic Relations
Fei Sun | Jiafeng Guo | Yanyan Lan | Jun Xu | Xueqi Cheng
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

HITSZ-ICRC: Exploiting Classification Approach for Answer Selection in Community Question Answering
Yongshuai Hou | Cong Tan | Xiaolong Wang | Yaoyun Zhang | Jun Xu | Qingcai Chen
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

UTH-CCB: The Participation of the SemEval 2015 Challenge – Task 14
Jun Xu | Yaoyun Zhang | Jingqi Wang | Yonghui Wu | Min Jiang | Ergin Soysal | Hua Xu
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)


Cross-lingual Opinion Analysis via Negative Transfer Detection
Lin Gui | Ruifeng Xu | Qin Lu | Jun Xu | Jian Xu | Bin Liu | Xiaolong Wang
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)


Incorporating Rule-based and Statistic-based Techniques for Coreference Resolution
Ruifeng Xu | Jun Xu | Jie Liu | Chengxiang Liu | Chengtian Zou | Lin Gui | Yanzhen Zheng | Peng Qu
Joint Conference on EMNLP and CoNLL - Shared Task


Instance Level Transfer Learning for Cross Lingual Opinion Analysis
Ruifeng Xu | Jun Xu | Xiaolong Wang
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)

Diversifying Information Needs in Results of Question Retrieval
Yaoyun Zhang | Xiaolong Wang | Xuan Wang | Ruifeng Xu | Jun Xu | ShiXi Fan
Proceedings of 5th International Joint Conference on Natural Language Processing


Combine Person Name and Person Identity Recognition and Document Clustering for Chinese Person Name Disambiguation
Ruifeng Xu | Jun Xu | Xiangying Dai | Chunyu Kit
CIPS-SIGHAN Joint Conference on Chinese Language Processing

HITSZ_CITYU: Combine Collocation, Context Words and Neighboring Sentence Sentiment in Sentiment Adjectives Disambiguation
Ruifeng Xu | Jun Xu | Chunyu Kit
Proceedings of the 5th International Workshop on Semantic Evaluation