ChengXiang Zhai

Also published as: Chengxiang Zhai


2020

Multi-task Learning for Multilingual Neural Machine Translation
Yiren Wang | ChengXiang Zhai | Hany Hassan
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

While monolingual data has been shown to be useful in improving bilingual neural machine translation (NMT), effectively and efficiently leveraging monolingual data for Multilingual NMT (MNMT) systems is a less explored area. In this work, we propose a multi-task learning (MTL) framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data. We conduct extensive empirical studies on MNMT systems with 10 language pairs from WMT datasets. We show that the proposed approach can effectively improve the translation quality for both high-resource and low-resource languages by a large margin, achieving significantly better results than the individual bilingual models. We also demonstrate the efficacy of the proposed approach in the zero-shot setup for language pairs without bitext training data. Furthermore, we show the effectiveness of MTL over pre-training approaches for both NMT and cross-lingual transfer learning NLU tasks; the proposed approach outperforms massive-scale models trained on a single task.

2019

TILM: Neural Language Models with Evolving Topical Influence
Shubhra Kanti Karmaker Santu | Kalyan Veeramachaneni | Chengxiang Zhai
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

The content of text data is often influenced by contextual factors that evolve over time (e.g., the content of social media is often influenced by topics covered in the major news streams). Existing language models do not consider the influence of such related evolving topics, and thus are not optimal. In this paper, we propose to incorporate such topical influence into a language model to both improve its accuracy and enable cross-stream analysis of topical influences. Specifically, we propose a novel language model called Topical Influence Language Model (TILM), an extension of a neural language model that captures the influences on the contents in one text stream by the evolving topics in another related (or possibly the same) text stream. Experimental results on six different text stream datasets composed of conference paper titles show that the incorporation of evolving topical influence into a language model is beneficial and that TILM outperforms multiple baselines in a challenging task of text forecasting. In addition to serving as a language model, TILM further enables interesting analysis of topical influence among multiple text streams.

2017

Identifying Humor in Reviews using Background Text Sources
Alex Morales | Chengxiang Zhai
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We study the problem of automatically identifying humorous text from a new kind of text data, i.e., online reviews. We propose a generative language model, based on the theory of incongruity, to model humorous text, which allows us to leverage background text sources, such as Wikipedia entry descriptions, and enables construction of multiple features for identifying humorous reviews. Evaluation of these features using supervised learning for classifying reviews into humorous and non-humorous reviews shows that the features constructed based on the proposed generative model are much more effective than the major features proposed in the existing literature, allowing us to achieve almost 86% accuracy. These humorous review predictions can also supply good indicators for identifying helpful reviews.

2016

MeTA: A Unified Toolkit for Text Retrieval and Analysis
Sean Massung | Chase Geigle | ChengXiang Zhai
Proceedings of ACL-2016 System Demonstrations

2012

A Discriminative Model for Query Spelling Correction with Latent Structural SVM
Huizhong Duan | Yanen Li | ChengXiang Zhai | Dan Roth
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Structural Topic Model for Latent Topical Structure Analysis
Hongning Wang | Duo Zhang | ChengXiang Zhai
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Opinosis: A Graph Based Approach to Abstractive Summarization of Highly Redundant Opinions
Kavita Ganesan | ChengXiang Zhai | Jiawei Han
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Exploiting Structured Ontology to Organize Scattered Online Opinions
Yue Lu | Huizhong Duan | Hongning Wang | ChengXiang Zhai
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Shallow Information Extraction from Medical Forum Data
Parikshit Sondhi | Manish Gupta | ChengXiang Zhai | Julia Hockenmaier
Coling 2010: Posters

Cross-Lingual Latent Topic Extraction
Duo Zhang | Qiaozhu Mei | ChengXiang Zhai
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Summarizing Contrastive Viewpoints in Opinionated Text
Michael Paul | ChengXiang Zhai | Roxana Girju
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2008

Generating Impact-Based Summaries for Scientific Literature
Qiaozhu Mei | ChengXiang Zhai
Proceedings of ACL-08: HLT

2007

Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference
Candace Sidner | Tanja Schultz | Matthew Stone | ChengXiang Zhai
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

A Systematic Exploration of the Feature Space for Relation Extraction
Jing Jiang | ChengXiang Zhai
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
Candace Sidner | Tanja Schultz | Matthew Stone | ChengXiang Zhai
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

Statistical Language Models for Information Retrieval
ChengXiang Zhai
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts

Instance Weighting for Domain Adaptation in NLP
Jing Jiang | ChengXiang Zhai
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Named Entity Transliteration with Comparable Corpora
Richard Sproat | Tao Tao | ChengXiang Zhai
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

Unsupervised Named Entity Transliteration Using Temporal and Phonetic Correlation
Tao Tao | Su-Youn Yoon | Andrew Fister | Richard Sproat | ChengXiang Zhai
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

Exploiting Domain Structure for Named Entity Recognition
Jing Jiang | ChengXiang Zhai
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

Language Model Information Retrieval with Document Expansion
Tao Tao | Xuanhui Wang | Qiaozhu Mei | ChengXiang Zhai
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

1997

Fast Statistical Parsing of Noun Phrases for Document Indexing
Chengxiang Zhai
Fifth Conference on Applied Natural Language Processing

1996

Noun Phrase Analysis in Large Unrestricted Text for Information Retrieval
David A. Evans | Chengxiang Zhai
34th Annual Meeting of the Association for Computational Linguistics