Lizhen Qu


2020

Context Dependent Semantic Parsing: A Survey
Zhuang Li | Lizhen Qu | Gholamreza Haffari
Proceedings of the 28th International Conference on Computational Linguistics

Semantic parsing is the task of translating natural language utterances into machine-readable meaning representations. Currently, most semantic parsing methods cannot utilize contextual information (e.g. dialogue and comment history), which has great potential to boost semantic parsing systems. To address this issue, context dependent semantic parsing has recently drawn a lot of attention. In this survey, we investigate progress on methods for context dependent semantic parsing, together with the current datasets and tasks. We then point out open problems and challenges for future research in this area.

CosMo: Conditional Seq2Seq-based Mixture Model for Zero-Shot Commonsense Question Answering
Farhad Moghimifar | Lizhen Qu | Yue Zhuo | Mahsa Baktashmotlagh | Gholamreza Haffari
Proceedings of the 28th International Conference on Computational Linguistics

Commonsense reasoning refers to the ability to evaluate a social situation and act accordingly. Identification of the implicit causes and effects of a social context is the driving capability which can enable machines to perform commonsense reasoning. The dynamic world of social interactions requires context-dependent on-demand systems to infer such underlying information. However, current approaches in this realm lack the ability to perform commonsense reasoning upon facing an unseen situation, mostly due to an inability to identify a diverse range of implicit social relations. Hence they fail to estimate the correct reasoning path. In this paper, we present Conditional Seq2Seq-based Mixture model (CosMo), which provides us with the capabilities of dynamic and diverse content generation. We use CosMo to generate context-dependent clauses, which form a dynamic Knowledge Graph (KG) on-the-fly for commonsense reasoning. To show the adaptability of our model to context-dependent knowledge generation, we address the task of zero-shot commonsense question answering. The empirical results indicate an improvement of up to +5.2% over the state-of-the-art models.

Personal Information Leakage Detection in Conversations
Qiongkai Xu | Lizhen Qu | Zeyu Gao | Gholamreza Haffari
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

The global market size of conversational assistants (chatbots) is expected to grow to USD 9.4 billion by 2024, according to MarketsandMarkets. Despite the wide use of chatbots, leakage of personal information through chatbots poses serious privacy concerns for their users. In this work, we propose to protect personal information by warning users of detected suspicious sentences generated by conversational assistants. The detection task is formulated as an alignment optimization problem and a new dataset PERSONA-LEAKAGE is collected for evaluation. In this paper, we propose two novel constrained alignment models, which consistently outperform baseline methods on this dataset. Moreover, we conduct analysis on the behavior of recently proposed personalized chit-chat dialogue systems. The empirical results show that those systems suffer more from personal information disclosure than the widely used Seq2Seq model and the language model. In those cases, a significant number of information leaking utterances can be detected by our models with high precision.

2019

ALTER: Auxiliary Text Rewriting Tool for Natural Language Generation
Qiongkai Xu | Chenchen Xu | Lizhen Qu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

In this paper, we describe ALTER, an auxiliary text rewriting tool that facilitates the rewriting process for natural language generation tasks, such as paraphrasing, text simplification, fairness-aware text rewriting, and text style transfer. Our tool is characterized by two features: i) recording of word-level revision histories and ii) flexible auxiliary edit support and feedback to annotators. The text rewriting assistance and traceable rewriting history are potentially beneficial to future research on natural language generation.

Privacy-Aware Text Rewriting
Qiongkai Xu | Lizhen Qu | Chenchen Xu | Ran Cui
Proceedings of the 12th International Conference on Natural Language Generation

Biased decisions made by automatic systems have led to growing concerns in research communities. Recent work from the NLP community focuses on building systems that make fair decisions based on text. Instead of relying on unknown decision systems or human decision-makers, we argue that a better way to protect data providers is to remove the traces of sensitive information before publishing the data. In light of this, we propose a new privacy-aware text rewriting task and explore two privacy-aware back-translation methods for the task, based on adversarial training and approximate fairness risk. Our extensive experiments on three real-world datasets with varying demographic attributes show that our methods are effective in obfuscating sensitive attributes. We have also observed that the fairness risk method retains better semantics and fluency, while the adversarial training method tends to leak less sensitive information.

2017

Demographic Inference on Twitter using Recursive Neural Networks
Sunghwan Mac Kim | Qiongkai Xu | Lizhen Qu | Stephen Wan | Cécile Paris
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

In social media, demographic inference is a critical task for gaining a better understanding of a cohort and for facilitating interaction with one’s audience. Most previous work has made independence assumptions over topological, textual and label information on social networks. In this work, we employ recursive neural networks to break down these independence assumptions and infer demographic characteristics on Twitter. We show that our model performs better than existing models including the state-of-the-art.

2016

Named Entity Recognition for Novel Types by Transfer Learning
Lizhen Qu | Gabriela Ferraro | Liyuan Zhou | Weiwei Hou | Timothy Baldwin
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

STransE: a novel embedding model of entities and relationships in knowledge bases
Dat Quoc Nguyen | Kairit Sirts | Lizhen Qu | Mark Johnson
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Neighborhood Mixture Model for Knowledge Base Completion
Dat Quoc Nguyen | Kairit Sirts | Lizhen Qu | Mark Johnson
Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning

Unsupervised Pre-training With Seq2Seq Reconstruction Loss for Deep Relation Extraction Models
Zhuang Li | Lizhen Qu | Qiongkai Xu | Mark Johnson
Proceedings of the Australasian Language Technology Association Workshop 2016

Pairwise FastText Classifier for Entity Disambiguation
Cheng Yu | Bing Chu | Rohit Ram | James Aichinger | Lizhen Qu | Hanna Suominen
Proceedings of the Australasian Language Technology Association Workshop 2016

2015

Big Data Small Data, In Domain Out-of Domain, Known Word Unknown Word: The Impact of Word Representations on Sequence Labelling Tasks
Lizhen Qu | Gabriela Ferraro | Liyuan Zhou | Weiwei Hou | Nathan Schneider | Timothy Baldwin
Proceedings of the Nineteenth Conference on Computational Natural Language Learning

2014

Senti-LSSVM: Sentiment-Oriented Multi-Relation Extraction with Latent Structural SVM
Lizhen Qu | Yi Zhang | Rui Wang | Lili Jiang | Rainer Gemulla | Gerhard Weikum
Transactions of the Association for Computational Linguistics, Volume 2

Extracting instances of sentiment-oriented relations from user-generated web documents is important for online marketing analysis. Unlike previous work, we formulate this extraction task as a structured prediction problem and design the corresponding inference as an integer linear program. Our latent structural SVM based model can learn from training corpora that do not contain explicit annotations of sentiment-bearing expressions, and it can simultaneously recognize instances of both binary (polarity) and ternary (comparative) relations with regard to entity mentions of interest. The empirical evaluation shows that our approach significantly outperforms state-of-the-art systems across domains (cameras and movies) and across genres (reviews and forum posts). The gold standard corpus that we built will also be a valuable resource for the community.

2012

A Weakly Supervised Model for Sentence-Level Semantic Orientation Analysis with Multiple Experts
Lizhen Qu | Rainer Gemulla | Gerhard Weikum
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2010

The Bag-of-Opinions Method for Review Rating Prediction from Sparse Text Patterns
Lizhen Qu | Georgiana Ifrim | Gerhard Weikum
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)