Bowen Zhou


2020

Self-Attention Guided Copy Mechanism for Abstractive Summarization
Song Xu | Haoran Li | Peng Yuan | Youzheng Wu | Xiaodong He | Bowen Zhou
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Copy modules are widely used in recent abstractive summarization models, helping the decoder extract words from the source into the summary. Generally, the encoder-decoder attention serves as the copy distribution, but guaranteeing that important words in the source are copied remains a challenge. In this work, we propose a Transformer-based model to enhance the copy mechanism. Specifically, we identify the importance of each source word based on its degree centrality in a directed graph built from the self-attention layer of the Transformer. We use the centrality of each source word to guide the copy process explicitly. Experimental results show that the self-attention graph provides useful guidance for the copy distribution. Our proposed models significantly outperform the baseline methods on the CNN/Daily Mail dataset and the Gigaword dataset.
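
As a rough illustration of the idea in this abstract, the sketch below treats a self-attention matrix as a weighted directed graph, takes each token's in-degree (column sum) as its centrality, and blends the normalized centrality into a copy distribution. The function names, the choice of in-degree, and the linear interpolation with weight `lam` are assumptions made for illustration, not the formulation used in the paper.

```python
def degree_centrality(attn):
    """In-degree centrality of each token in a directed attention graph.

    attn[i][j] is the weight of the edge from token i to token j,
    i.e. how much token i attends to token j.
    """
    n = len(attn)
    return [sum(attn[i][j] for i in range(n)) for j in range(n)]


def guide_copy(copy_dist, centrality, lam=0.5):
    """Blend a copy distribution with normalized centrality scores.

    The interpolation weight `lam` is a hypothetical parameter.
    """
    total = sum(centrality)
    prior = [c / total for c in centrality]
    return [(1 - lam) * p + lam * q for p, q in zip(copy_dist, prior)]


# Toy 3-token self-attention matrix; each row sums to 1.
attn = [
    [0.1, 0.8, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
]
centrality = degree_centrality(attn)
guided = guide_copy([0.5, 0.25, 0.25], centrality)
```

In this toy input the second token receives the most attention, so its share of the guided copy distribution rises above its original 0.25 while the result still sums to 1.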

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding
Yun Tang | Jing Huang | Guangtao Wang | Xiaodong He | Bowen Zhou
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Distance-based knowledge graph embeddings have shown substantial improvement on the knowledge graph link prediction task, from TransE to the latest state-of-the-art RotatE. However, complex relations such as N-to-1, 1-to-N and N-to-N remain challenging to predict. In this work, we propose a novel distance-based approach for knowledge graph link prediction. First, we extend RotatE from the 2D complex domain to a high-dimensional space with orthogonal transforms to model relations. The orthogonal transform embedding for relations retains the capability to model symmetric/anti-symmetric, inverse and compositional relations while achieving better modeling capacity. Second, the graph context is integrated into the distance scoring functions directly. Specifically, graph context is explicitly modeled via two directed context representations. Each node embedding in the knowledge graph is augmented with two context representations, which are computed from the neighboring outgoing and incoming nodes/edges, respectively. The proposed approach improves prediction accuracy on the difficult N-to-1, 1-to-N and N-to-N cases. Our experimental results show that it achieves state-of-the-art results on two common benchmarks, FB15k-237 and WN18RR, especially on FB15k-237, which has many high in-degree nodes.
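
For context, the RotatE baseline that this abstract extends can be sketched as follows: each relation rotates the head embedding in the complex plane, one dimension at a time, and a triple is scored by how close the rotated head lands to the tail. This is a minimal sketch of the 2D-rotation starting point only; it does not implement the paper's high-dimensional orthogonal transforms or graph context. The function name and the use of a sum of per-dimension moduli as the distance are assumptions.

```python
import cmath
import math


def rotate_score(head, phases, tail):
    """Negative distance between the rotated head and the tail.

    head, tail: lists of complex numbers (entity embeddings).
    phases: one rotation angle per dimension (the relation).
    """
    rotated = [h * cmath.exp(1j * p) for h, p in zip(head, phases)]
    return -sum(abs(r - t) for r, t in zip(rotated, tail))


head = [1 + 0j, 0 + 1j]
phases = [math.pi / 2, math.pi]
tail = [0 + 1j, 0 - 1j]

# The tail is exactly the rotated head, so the score is close to 0.
good = rotate_score(head, phases, tail)
# A mismatched tail yields a clearly negative score.
bad = rotate_score(head, phases, [1 + 0j, 1 + 0j])
# Negating the phases models the inverse relation (tail back to head).
inverse = rotate_score(tail, [-p for p in phases], head)
```

A rotation is one simple orthogonal transform, which is why negating the angles gives the inverse relation and adding angles gives composition.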

The JDDC Corpus: A Large-Scale Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service
Meng Chen | Ruixue Liu | Lei Shen | Shaozu Yuan | Jingyan Zhou | Youzheng Wu | Xiaodong He | Bowen Zhou
Proceedings of the 12th Language Resources and Evaluation Conference

Human conversations are complicated, and building a human-like dialogue agent is an extremely challenging task. With the rapid development of deep learning techniques, data-driven models, which need a huge amount of real conversation data, have become more and more prevalent. In this paper, we construct a large-scale, real-scenario Chinese e-commerce conversation corpus, JDDC, with more than 1 million multi-turn dialogues, 20 million utterances, and 150 million words. The dataset reflects several characteristics of human-human conversations, e.g., goal-driven behavior and long-term dependency among the context. It also covers various dialogue types, including task-oriented, chitchat and question-answering. Extra intent information and three well-annotated challenge sets are also provided. We then evaluate several retrieval-based and generative models to provide basic benchmark performance on the JDDC corpus. We hope JDDC can serve as an effective testbed and benefit fundamental research on dialogue tasks.

On the Faithfulness for E-commerce Product Summarization
Peng Yuan | Haoran Li | Song Xu | Youzheng Wu | Xiaodong He | Bowen Zhou
Proceedings of the 28th International Conference on Computational Linguistics

In this work, we present a model to generate e-commerce product summaries. The consistency between the generated summary and the product attributes is an essential criterion for the e-commerce product summarization task. To enhance the consistency, first, we encode the product attribute table to guide the process of summary generation. Second, we identify the attribute words in the vocabulary and constrain them to appear in the summaries only through copying from the source, i.e., attribute words that are not in the source cannot be generated. We construct a Chinese e-commerce product summarization dataset, and the experimental results on this dataset demonstrate that our models significantly improve faithfulness.

Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training
Yingyao Wang | Junwei Bao | Guangyi Liu | Youzheng Wu | Xiaodong He | Bowen Zhou | Tiejun Zhao
Proceedings of the 28th International Conference on Computational Linguistics

This paper aims to enhance few-shot relation classification, especially for sentences that jointly describe multiple relations. Because some relations frequently co-occur in the same context, previous few-shot relation classifiers struggle to distinguish them with few annotated instances. To alleviate this relation confusion problem, we propose CTEG, a model equipped with two novel mechanisms to learn to decouple these easily confused relations. On the one hand, an Entity-Guided Attention (EGA) mechanism, which leverages the syntactic relations and relative positions between each word and the specified entity pair, is introduced to guide the attention to filter out information causing confusion. On the other hand, a Confusion-Aware Training (CAT) method is proposed to explicitly learn to distinguish relations by playing a pushing-away game between classifying a sentence into a true relation and its confusing relation. Extensive experiments are conducted on the FewRel dataset, and the results show that our proposed model achieves results comparable to, and in some cases much better than, strong baselines in terms of accuracy. Furthermore, the ablation test and case study verify the effectiveness of our proposed EGA and CAT, especially in addressing the relation confusion problem.

Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product
Tiangang Zhu | Yue Wang | Haoran Li | Youzheng Wu | Xiaodong He | Bowen Zhou
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Product attribute values are essential in many e-commerce scenarios, such as customer service robots, product recommendations, and product retrieval. In the real world, however, the attribute values of a product are usually incomplete and vary over time, which greatly hinders practical applications. In this paper, we propose a multimodal method to jointly predict product attributes and extract values from textual product descriptions with the help of the product images. We argue that product attributes and values are highly correlated, e.g., it is easier to extract the values when the product attributes are given. Thus, we jointly model the attribute prediction and value extraction tasks from multiple aspects, focusing on the interactions between attributes and values. Moreover, product images have distinct effects on our tasks for different product attributes and values. Thus, we selectively draw useful visual information from product images to enhance our model. We annotate a multimodal product attribute value dataset that contains 87,194 instances, and the experimental results on this dataset demonstrate that explicitly modeling the relationship between attributes and values helps our method establish the correspondence between them, and that selectively utilizing visual product information is necessary for the task. Our code and dataset are available at https://github.com/jd-aig/JAVE.

2019

Relation Module for Non-Answerable Predictions on Reading Comprehension
Kevin Huang | Yun Tang | Jing Huang | Xiaodong He | Bowen Zhou
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Machine reading comprehension (MRC) has attracted significant research attention recently, due to an increase in challenging reading comprehension datasets. In this paper, we aim to improve an MRC model’s ability to determine whether a question has an answer in a given context (e.g., the recently proposed SQuAD 2.0 task). The relation module consists of both semantic extraction and relational information. We first extract high-level semantics as objects from both question and context with multi-head self-attentive pooling. These semantic objects are then passed to a relation network, which generates relationship scores for each object pair in a sentence. These scores are used to determine whether a question is non-answerable. We test the relation module on the SQuAD 2.0 dataset using both the BiDAF and BERT models as baseline readers. We obtain a 1.8% gain in F1 on top of the BiDAF reader, and 1.0% on top of the BERT base model. These results show the effectiveness of our relation module on MRC.

Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs
Ming Tu | Guangtao Wang | Jing Huang | Yun Tang | Xiaodong He | Bowen Zhou
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Multi-hop reading comprehension (RC) across documents poses new challenges over single-document RC because it requires reasoning over multiple documents to reach the final answer. In this paper, we propose a new model to tackle the multi-hop RC problem. We introduce a heterogeneous graph with different types of nodes and edges, named the Heterogeneous Document-Entity (HDE) graph. The advantage of the HDE graph is that it contains different granularity levels of information, including candidates, documents and entities in specific document contexts. Our proposed model can reason over the HDE graph with node representations initialized with co-attention and self-attention based context encoders. We employ Graph Neural Network (GNN) based message passing algorithms to accumulate evidence on the proposed HDE graph. Evaluated on the blind test set of the Qangaroo WikiHop data set, our HDE graph based single model delivers competitive results, and the ensemble model achieves state-of-the-art performance.

2018

Diverse Few-Shot Text Classification with Multiple Metrics
Mo Yu | Xiaoxiao Guo | Jinfeng Yi | Shiyu Chang | Saloni Potdar | Yu Cheng | Gerald Tesauro | Haoyu Wang | Bowen Zhou
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We study few-shot learning in natural language domains. Compared to many existing works that apply either metric-based or optimization-based meta-learning to the image domain with low inter-task variance, we consider a more realistic setting, where tasks are diverse. However, this setting poses tremendous difficulties for existing state-of-the-art metric-based algorithms, since a single metric is insufficient to capture complex task variations in the natural language domain. To alleviate the problem, we propose an adaptive metric learning approach that automatically determines the best weighted combination from a set of metrics obtained from meta-training tasks for a newly seen few-shot task. Extensive quantitative evaluations on real-world sentiment analysis and dialog intent classification datasets demonstrate that the proposed method performs favorably against state-of-the-art few-shot learning algorithms in terms of predictive accuracy. We make our code and data available for further study.

2017

Improved Neural Relation Detection for Knowledge Base Question Answering
Mo Yu | Wenpeng Yin | Kazi Saidul Hasan | Cicero dos Santos | Bing Xiang | Bowen Zhou
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Relation detection is a core component of many NLP applications including Knowledge Base Question Answering (KBQA). In this paper, we propose a hierarchical recurrent neural network enhanced by residual learning which detects KB relations given an input question. Our method uses deep residual bidirectional LSTMs to compare questions and relation names via different levels of abstraction. Additionally, we propose a simple KBQA system that integrates entity linking and our proposed relation detector to make the two components enhance each other. Our experimental results show that our approach not only achieves outstanding relation detection performance, but more importantly, it helps our KBQA system achieve state-of-the-art accuracy for both single-relation (SimpleQuestions) and multi-relation (WebQSP) QA benchmarks.

Group Sparse CNNs for Question Classification with Answer Sets
Mingbo Ma | Liang Huang | Bing Xiang | Bowen Zhou
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Question classification is an important task with wide applications. However, traditional techniques treat questions as general sentences, ignoring the corresponding answer data. To incorporate answer information into question modeling, we first introduce novel group sparse autoencoders, which refine question representation by utilizing group information in the answer set. We then propose novel group sparse CNNs, which naturally learn question representation with respect to their answers by implanting group sparse autoencoders into traditional CNNs. The proposed model significantly outperforms strong baselines on four datasets.

2016

Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling
Gakuto Kurata | Bing Xiang | Bowen Zhou | Mo Yu
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Improved Neural Network-based Multi-label Classification with Better Initialization Leveraging Label Co-occurrence
Gakuto Kurata | Bing Xiang | Bowen Zhou
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
Ramesh Nallapati | Bowen Zhou | Cicero dos Santos | Çağlar Gülçehre | Bing Xiang
Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning

Simple Question Answering by Attentive Convolutional Neural Network
Wenpeng Yin | Mo Yu | Bing Xiang | Bowen Zhou | Hinrich Schütze
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This work focuses on answering single-relation factoid questions over Freebase. Each question can be answered from a single fact of the form (subject, predicate, object) in Freebase. This task, simple question answering (SimpleQA), can be addressed via a two-step pipeline: entity linking and fact selection. In fact selection, we match the subject entity in a fact candidate with the entity mention in the question by a character-level convolutional neural network (char-CNN), and match the predicate in that fact with the question by a word-level CNN (word-CNN). This work makes two main contributions. (i) A simple and effective entity linker over Freebase is proposed. Our entity linker outperforms the state-of-the-art entity linker on the SimpleQA task. (ii) A novel attentive maxpooling is stacked over word-CNN, so that the predicate representation can be matched with the predicate-focused question representation more effectively. Experiments show that our system sets a new state of the art on this task.

Pointing the Unknown Words
Caglar Gulcehre | Sungjin Ahn | Ramesh Nallapati | Bowen Zhou | Yoshua Bengio
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Improved Representation Learning for Question Answer Matching
Ming Tan | Cicero dos Santos | Bing Xiang | Bowen Zhou
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs
Wenpeng Yin | Hinrich Schütze | Bing Xiang | Bowen Zhou
Transactions of the Association for Computational Linguistics, Volume 4

How to model a pair of sentences is a critical issue in many NLP tasks such as answer selection (AS), paraphrase identification (PI) and textual entailment (TE). Most prior work (i) deals with one individual task by fine-tuning a specific system; (ii) models each sentence’s representation separately, rarely considering the impact of the other sentence; or (iii) relies fully on manually designed, task-specific linguistic features. This work presents a general Attention Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences. We make three contributions. (i) The ABCNN can be applied to a wide variety of tasks that require modeling of sentence pairs. (ii) We propose three attention schemes that integrate mutual influence between sentences into CNNs; thus, the representation of each sentence takes into consideration its counterpart. These interdependent sentence pair representations are more powerful than isolated sentence representations. (iii) ABCNNs achieve state-of-the-art performance on AS, PI and TE tasks. We release code at: https://github.com/yinwenpeng/Answer_Selection.
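
The "mutual influence between sentences" described in this abstract can be pictured as an attention matrix whose entry (i, j) scores how well word i of one sentence matches word j of the other. The sketch below is a minimal illustration only: the match score 1 / (1 + Euclidean distance), the function name, and the use of row and column sums as per-word attention weights are assumptions here, and the released code linked above should be consulted for the actual schemes.

```python
import math


def attention_matrix(sent_a, sent_b):
    """A[i][j] scores how well word i of sent_a matches word j of sent_b."""
    def match(u, v):
        # Assumed match score: 1 / (1 + Euclidean distance), in (0, 1].
        return 1.0 / (1.0 + math.dist(u, v))
    return [[match(u, v) for v in sent_b] for u in sent_a]


# Toy word vectors: a 2-word sentence and a 3-word sentence.
sent_a = [[1.0, 0.0], [0.0, 1.0]]
sent_b = [[1.0, 0.0], [0.0, 4.0], [3.0, 4.0]]

A = attention_matrix(sent_a, sent_b)
# Per-word attention weights: how strongly each word matches the other sentence.
row_weights = [sum(row) for row in A]        # one weight per word in sent_a
col_weights = [sum(col) for col in zip(*A)]  # one weight per word in sent_b
```

Because every entry depends on both sentences, any representation weighted by `row_weights` or `col_weights` reflects the counterpart sentence rather than being computed in isolation.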

2015

Efficient Hyper-parameter Optimization for NLP Applications
Lidan Wang | Minwei Feng | Bowen Zhou | Bing Xiang | Sridhar Mahadevan
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Classifying Relations by Ranking with Convolutional Neural Networks
Cícero dos Santos | Bing Xiang | Bowen Zhou
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Dependency-based Convolutional Neural Networks for Sentence Embedding
Mingbo Ma | Liang Huang | Bowen Zhou | Bing Xiang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2013

Anchor Graph: Global Reordering Contexts for Statistical Machine Translation
Hendra Setiawan | Bowen Zhou | Bing Xiang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding
Martin Čmejrek | Haitao Mi | Bowen Zhou
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

What is Hidden among Translation Rules
Libin Shen | Bowen Zhou
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

A Corpus Level MIRA Tuning Strategy for Machine Translation
Ming Tan | Tian Xia | Shaojun Wang | Bowen Zhou
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Enlisting the Ghost: Modeling Empty Categories for Machine Translation
Bing Xiang | Xiaoqiang Luo | Bowen Zhou
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Two-Neighbor Orientation Model with Cross-Boundary Global Contexts
Hendra Setiawan | Bowen Zhou | Bing Xiang | Libin Shen
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Discriminative Training of 150 Million Translation Parameters and Its Application to Pruning
Hendra Setiawan | Bowen Zhou
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2010

Two Methods for Extending Hierarchical Rules from the Bilingual Chart Parsing
Martin Čmejrek | Bowen Zhou
Coling 2010: Posters

A Power Mean Based Algorithm for Combining Multiple Alignment Tables
Sameer Maskey | Steven Rennie | Bowen Zhou
Coling 2010: Posters

Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages
Bing Xiang | Yonggang Deng | Bowen Zhou
Proceedings of the ACL 2010 Conference Short Papers

Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions
Zhongqiang Huang | Martin Čmejrek | Bowen Zhou
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

Optimizing Word Alignment Combination For Phrase Table Training
Yonggang Deng | Bowen Zhou
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

Prior Derivation Models For Formally Syntax-Based Translation Using Linguistically Syntactic Parsing and Tree Kernels
Bowen Zhou | Bing Xiang | Xiaodan Zhu | Yuqing Gao
Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)

2006

IBM MASTOR SYSTEM: Multilingual Automatic Speech-to-Speech Translator
Yuqing Gao | Bowen Zhou | Ruhi Sarikaya | Mohamed Afify | Hong-Kwang Kuo | Wei-zhong Zhu | Yonggang Deng | Charles Prosser | Wei Zhang | Laurent Besacier
Proceedings of the First International Workshop on Medical Speech Translation