Hua Wu


2020

pdf bib
PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable
Siqi Bao | Huang He | Fan Wang | Hua Wu | Haifeng Wang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Pre-training models have been proved effective for a wide range of natural language processing tasks. Inspired by this, we propose a novel dialogue generation pre-training framework to support various kinds of conversations, including chit-chat, knowledge grounded dialogues, and conversational question answering. In this framework, we adopt flexible attention mechanisms to fully leverage the bi-directional context and the uni-directional characteristic of language generation. We also introduce discrete latent variables to tackle the inherent one-to-many mapping problem in response generation. Two reciprocal tasks of response generation and latent act recognition are designed and carried out simultaneously within a shared network. Comprehensive experiments on three publicly available datasets verify the effectiveness and superiority of the proposed framework.

pdf bib
Towards Conversational Recommendation over Multi-Type Dialogs
Zeming Liu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che | Ting Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We focus on the study of conversational recommendation in the context of multi-type dialogs, where the bots can proactively and naturally lead a conversation from a non-recommendation dialog (e.g., QA) to a recommendation dialog, taking into account user’s interests and feedback. To facilitate the study of this task, we create a human-to-human Chinese dialog dataset DuRecDial (about 10k dialogs, 156k utterances), where there are multiple sequential dialogs for a pair of a recommendation seeker (user) and a recommender (bot). In each dialog, the recommender proactively leads a multi-type dialog to approach recommendation targets and then makes multiple recommendations with rich interaction behavior. This dataset allows us to systematically investigate different parts of the overall problem, e.g., how to naturally lead a dialog, how to interact with users for recommendation. Finally we establish baseline results on DuRecDial for future studies.

pdf bib
Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation
Jun Xu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che | Ting Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and learn a graph grounded dialog policy, aimed at fostering a more coherent and controllable dialog. To this end, we first construct a conversational graph (CG) from dialog corpora, in which there are vertices to represent “what to say” and “how to say”, and edges to represent natural transition between a message (the last utterance in a dialog context) and its response. We then present a novel CG grounded policy learning framework that conducts dialog flow planning by graph traversal, which learns to identify a what-vertex and a how-vertex from the CG at each turn to guide response generation. In this way, we effectively leverage the CG to facilitate policy learning as follows: (1) it enables more effective long-term reward design, (2) it provides high-quality candidate actions, and (3) it gives us more control over the policy. Results on two benchmark corpora demonstrate the effectiveness of this framework.

pdf bib
SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis
Hao Tian | Can Gao | Xinyan Xiao | Hao Liu | Bolei He | Hua Wu | Haifeng Wang | Feng Wu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recently, sentiment analysis has seen remarkable advance with the help of pre-training approaches. However, sentiment knowledge, such as sentiment words and aspect-sentiment pairs, is ignored in the process of pre-training, despite the fact that they are widely used in traditional sentiment analysis approaches. In this paper, we introduce Sentiment Knowledge Enhanced Pre-training (SKEP) in order to learn a unified sentiment representation for multiple sentiment analysis tasks. With the help of automatically-mined knowledge, SKEP conducts sentiment masking and constructs three sentiment knowledge prediction objectives, so as to embed sentiment information at the word, polarity and aspect level into pre-trained sentiment representation. In particular, the prediction of aspect-sentiment pairs is converted into multi-label classification, aiming to capture the dependency between words in a pair. Experiments on three kinds of sentiment tasks show that SKEP significantly outperforms strong pre-training baseline, and achieves new state-of-the-art results on most of the test datasets. We release our code at https://github.com/baidu/Senta.

pdf bib
Leveraging Graph to Improve Abstractive Multi-Document Summarization
Wei Li | Xinyan Xiao | Jiachen Liu | Hua Wu | Haifeng Wang | Junping Du
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Graphs that capture relations between textual units have great benefits for detecting salient information from multiple documents and generating overall coherent summaries. In this paper, we develop a neural abstractive multi-document summarization (MDS) model which can leverage well-known graph representations of documents such as similarity graph and discourse graph, to more effectively process multiple input documents and produce abstractive summaries. Our model utilizes graphs to encode documents in order to capture cross-document relations, which is crucial to summarizing long documents. Our model can also take advantage of graphs to guide the summary generation process, which is beneficial for generating coherent and concise summaries. Furthermore, pre-trained language models can be easily combined with our model, which further improve the summarization performance significantly. Empirical results on the WikiSum and MultiNews dataset show that the proposed architecture brings substantial improvements over several strong baselines.

pdf bib
Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer
Chulun Zhou | Liangyu Chen | Jiachen Liu | Xinyan Xiao | Jinsong Su | Sheng Guo | Hua Wu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Unsupervised style transfer aims to change the style of an input sentence while preserving its original content without using parallel training data. In current dominant approaches, owing to the lack of fine-grained control on the influence from the target style, they are unable to yield desirable output sentences. In this paper, we propose a novel attentional sequence-to-sequence (Seq2seq) model that dynamically exploits the relevance of each output word to the target style for unsupervised style transfer. Specifically, we first pretrain a style classifier, where the relevance of each input word to the original style can be quantified via layer-wise relevance propagation. In a denoising auto-encoding manner, we train an attentional Seq2seq model to reconstruct input sentences and repredict word-level previously-quantified style relevance simultaneously. In this way, this model is endowed with the ability to automatically predict the style relevance of each output word. Then, we equip the decoder of this model with a neural style component to exploit the predicted wordlevel style relevance for better style transfer. Particularly, we fine-tune this model using a carefully-designed objective function involving style transfer, style relevance consistency, content preservation and fluency modeling loss terms. Experimental results show that our proposed model achieves state-of-the-art performance in terms of both transfer accuracy and content preservation.

pdf bib
Syntactic and Semantic-driven Learning for Open Information Extraction
Jialong Tang | Yaojie Lu | Hongyu Lin | Xianpei Han | Le Sun | Xinyan Xiao | Hua Wu
Findings of the Association for Computational Linguistics: EMNLP 2020

One of the biggest bottlenecks in building accurate, high coverage neural open IE systems is the need for large labelled corpora. The diversity of open domain corpora and the variety of natural language expressions further exacerbate this problem. In this paper, we propose a syntactic and semantic-driven learning approach, which can learn neural open IE models without any human-labelled data by leveraging syntactic and semantic knowledge as noisier, higher-level supervision. Specifically, we first employ syntactic patterns as data labelling functions and pretrain a base model using the generated labels. Then we propose a syntactic and semantic-driven reinforcement learning algorithm, which can effectively generalize the base model to open situations with high accuracy. Experimental results show that our approach significantly outperforms the supervised counterparts, and can even achieve competitive performance to supervised state-of-the-art (SoA) model.

pdf bib
Proceedings of the First Workshop on Automatic Simultaneous Translation
Hua Wu | Collin Cherry | Liang Huang | Zhongjun He | Mark Liberman | James Cross | Yang Liu
Proceedings of the First Workshop on Automatic Simultaneous Translation

pdf bib
Learning Adaptive Segmentation Policy for Simultaneous Translation
Ruiqing Zhang | Chuanqiang Zhang | Zhongjun He | Hua Wu | Haifeng Wang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Balancing accuracy and latency is a great challenge for simultaneous translation. To achieve high accuracy, the model usually needs to wait for more streaming text before translation, which results in increased latency. However, keeping low latency would probably hurt accuracy. Therefore, it is essential to segment the ASR output into appropriate units for translation. Inspired by human interpreters, we propose a novel adaptive segmentation policy for simultaneous translation. The policy learns to segment the source text by considering possible translations produced by the translation model, maintaining consistency between the segmentation and translation. Experimental results on Chinese-English and German-English translation show that our method achieves a better accuracy-latency trade-off over recently proposed state-of-the-art methods.

pdf bib
DuSQL: A Large-Scale and Pragmatic Chinese Text-to-SQL Dataset
Lijie Wang | Ao Zhang | Kun Wu | Ke Sun | Zhenghua Li | Hua Wu | Min Zhang | Haifeng Wang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Due to the lack of labeled data, previous research on text-to-SQL parsing mainly focuses on English. Representative English datasets include ATIS, WikiSQL, Spider, etc. This paper presents DuSQL, a larges-scale and pragmatic Chinese dataset for the cross-domain text-to-SQL task, containing 200 databases, 813 tables, and 23,797 question/SQL pairs. Our new dataset has three major characteristics. First, by manually analyzing questions from several representative applications, we try to figure out the true distribution of SQL queries in real-life needs. Second, DuSQL contains a considerable proportion of SQL queries involving row or column calculations, motivated by our analysis on the SQL query distributions. Finally, we adopt an effective data construction framework via human-computer collaboration. The basic idea is automatically generating SQL queries based on the SQL grammar and constrained by the given database. This paper describes in detail the construction process and data statistics of DuSQL. Moreover, we present and compare performance of several open-source text-to-SQL parsers with minor modification to accommodate Chinese, including a simple yet effective extension to IRNet for handling calculation SQL queries.

pdf bib
Diversified Multiple Instance Learning for Document-Level Multi-Aspect Sentiment Classification
Yunjie Ji | Hao Liu | Bolei He | Xinyan Xiao | Hua Wu | Yanhua Yu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Neural Document-level Multi-aspect Sentiment Classification (DMSC) usually requires a lot of manual aspect-level sentiment annotations, which is time-consuming and laborious. As document-level sentiment labeled data are widely available from online service, it is valuable to perform DMSC with such free document-level annotations. To this end, we propose a novel Diversified Multiple Instance Learning Network (D-MILN), which is able to achieve aspect-level sentiment classification with only document-level weak supervision. Specifically, we connect aspect-level and document-level sentiment by formulating this problem as multiple instance learning, providing a way to learn aspect-level classifier from the back propagation of document-level supervision. Two diversified regularizations are further introduced in order to avoid the overfitting on document-level signals during training. Diversified textual regularization encourages the classifier to select aspect-relevant snippets, and diversified sentimental regularization prevents the aspect-level sentiments from being overly consistent with document-level sentiment. Experimental results on TripAdvisor and BeerAdvocate datasets show that D-MILN remarkably outperforms recent weakly-supervised baselines, and is also comparable to the supervised method.

2019

pdf bib
Enhancing Local Feature Extraction with Global Representation for Neural Text Classification
Guocheng Niu | Hengru Xu | Bolei He | Xinyan Xiao | Hua Wu | Sheng Gao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

For text classification, traditional local feature driven models learn long dependency by deeply stacking or hybrid modeling. This paper proposes a novel Encoder1-Encoder2 architecture, where global information is incorporated into the procedure of local feature extraction from scratch. In particular, Encoder1 serves as a global information provider, while Encoder2 performs as a local feature extractor and is directly fed into the classifier. Meanwhile, two modes are also designed for their interaction. Thanks to the awareness of global information, our method is able to learn better instance specific local features and thus avoids complicated upper operations. Experiments conducted on eight benchmark datasets demonstrate that our proposed architecture promotes local feature driven models by a substantial margin and outperforms the previous best models in the fully-supervised setting.

pdf bib
Multi-agent Learning for Neural Machine Translation
Tianchi Bi | Hao Xiong | Zhongjun He | Hua Wu | Haifeng Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Conventional Neural Machine Translation (NMT) models benefit from the training with an additional agent, e.g., dual learning, and bidirectional decoding with one agent decod- ing from left to right and the other decoding in the opposite direction. In this paper, we extend the training framework to the multi-agent sce- nario by introducing diverse agents in an in- teractive updating process. At training time, each agent learns advanced knowledge from others, and they work together to improve translation quality. Experimental results on NIST Chinese-English, IWSLT 2014 German- English, WMT 2014 English-German and large-scale Chinese-English translation tasks indicate that our approach achieves absolute improvements over the strong baseline sys- tems and shows competitive performance on all tasks.

pdf bib
Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs
Zhibin Liu | Zheng-Yu Niu | Hua Wu | Haifeng Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Two types of knowledge, triples from knowledge graphs and texts from documents, have been studied for knowledge aware open domain conversation generation, in which graph paths can narrow down vertex candidates for knowledge selection decision, and texts can provide rich information for response generation. Fusion of a knowledge graph and texts might yield mutually reinforcing advantages, but there is less study on that. To address this challenge, we propose a knowledge aware chatting machine with three components, an augmented knowledge graph with both triples and texts, knowledge selector, and knowledge aware response generator. For knowledge selection on the graph, we formulate it as a problem of multi-hop graph reasoning to effectively capture conversation flow, which is more explainable and flexible in comparison with previous works. To fully leverage long text information that differentiates our graph from others, we improve a state of the art reasoning algorithm with machine reading comprehension technology. We demonstrate the effectiveness of our system on two datasets in comparison with state-of-the-art models.

pdf bib
D-NET: A Pre-Training and Fine-Tuning Framework for Improving the Generalization of Machine Reading Comprehension
Hongyu Li | Xiyuan Zhang | Yibing Liu | Yiming Zhang | Quan Wang | Xiangyang Zhou | Jing Liu | Hua Wu | Haifeng Wang
Proceedings of the 2nd Workshop on Machine Reading for Question Answering

In this paper, we introduce a simple system Baidu submitted for MRQA (Machine Reading for Question Answering) 2019 Shared Task that focused on generalization of machine reading comprehension (MRC) models. Our system is built on a framework of pretraining and fine-tuning, namely D-NET. The techniques of pre-trained language models and multi-task learning are explored to improve the generalization of MRC models and we conduct experiments to examine the effectiveness of these strategies. Our system is ranked at top 1 of all the participants in terms of averaged F1 score. Our codes and models will be released at PaddleNLP.

pdf bib
ARNOR: Attention Regularization based Noise Reduction for Distant Supervision Relation Classification
Wei Jia | Dai Dai | Xinyan Xiao | Hua Wu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Distant supervision is widely used in relation classification in order to create large-scale training data by aligning a knowledge base with an unlabeled corpus. However, it also introduces amounts of noisy labels where a contextual sentence actually does not express the labeled relation. In this paper, we propose ARNOR, a novel Attention Regularization based NOise Reduction framework for distant supervision relation classification. ARNOR assumes that a trustable relation label should be explained by the neural attention model. Specifically, our ARNOR framework iteratively learns an interpretable model and utilizes it to select trustable instances. We first introduce attention regularization to force the model to pay attention to the patterns which explain the relation labels, so as to make the model more interpretable. Then, if the learned model can clearly locate the relation patterns of a candidate instance in the training set, we will select it as a trustable instance for further training step. According to the experiments on NYT data, our ARNOR framework achieves significant improvements over state-of-the-art methods in both relation classification performance and noise reduction effect.

pdf bib
Enhancing Pre-Trained Language Representations with Rich Knowledge for Machine Reading Comprehension
An Yang | Quan Wang | Jing Liu | Kai Liu | Yajuan Lyu | Hua Wu | Qiaoqiao She | Sujian Li
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Machine reading comprehension (MRC) is a crucial and challenging task in NLP. Recently, pre-trained language models (LMs), especially BERT, have achieved remarkable success, presenting new state-of-the-art results in MRC. In this work, we investigate the potential of leveraging external knowledge bases (KBs) to further improve BERT for MRC. We introduce KT-NET, which employs an attention mechanism to adaptively select desired knowledge from KBs, and then fuses selected knowledge with BERT to enable context- and knowledge-aware predictions. We believe this would combine the merits of both deep LMs and curated KBs towards better MRC. Experimental results indicate that KT-NET offers significant and consistent improvements over BERT, outperforming competitive baselines on ReCoRD and SQuAD1.1 benchmarks. Notably, it ranks the 1st place on the ReCoRD leaderboard, and is also the best single model on the SQuAD1.1 leaderboard at the time of submission (March 4th, 2019).

pdf bib
STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework
Mingbo Ma | Liang Huang | Hao Xiong | Renjie Zheng | Kaibo Liu | Baigong Zheng | Chuanqiang Zhang | Zhongjun He | Hairong Liu | Xing Li | Hua Wu | Haifeng Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Simultaneous translation, which translates sentences before they are finished, is use- ful in many scenarios but is notoriously dif- ficult due to word-order differences. While the conventional seq-to-seq framework is only suitable for full-sentence translation, we pro- pose a novel prefix-to-prefix framework for si- multaneous translation that implicitly learns to anticipate in a single translation model. Within this framework, we present a very sim- ple yet surprisingly effective “wait-k” policy trained to generate the target sentence concur- rently with the source sentence, but always k words behind. Experiments show our strat- egy achieves low latency and reasonable qual- ity (compared to full-sentence translation) on 4 directions: zh↔en and de↔en.

pdf bib
Proactive Human-Machine Conversation with Explicit Conversation Goal
Wenquan Wu | Zhen Guo | Xiangyang Zhou | Hua Wu | Xiyuan Zhang | Rongzhong Lian | Haifeng Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Though great progress has been made for human-machine conversation, current dialogue system is still in its infancy: it usually converses passively and utters words more as a matter of response, rather than on its own initiatives. In this paper, we take a radical step towards building a human-like conversational agent: endowing it with the ability of proactively leading the conversation (introducing a new topic or maintaining the current topic). To facilitate the development of such conversation systems, we create a new dataset named Konv where one acts as a conversation leader and the other acts as the follower. The leader is provided with a knowledge graph and asked to sequentially change the discussion topics, following the given conversation goal, and meanwhile keep the dialogue as natural and engaging as possible. Konv enables a very challenging task as the model needs to both understand dialogue and plan over the given knowledge graph. We establish baseline results on this dataset (about 270K utterances and 30k dialogues) using several state-of-the-art models. Experimental results show that dialogue models that plan over the knowledge graph can make full use of related knowledge to generate more diverse multi-turn conversations. The baseline systems along with the dataset are publicly available.

pdf bib
Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment
Siqi Bao | Huang He | Fan Wang | Rongzhong Lian | Hua Wu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper, a novel Generation-Evaluation framework is developed for multi-turn conversations with the objective of letting both participants know more about each other. For the sake of rational knowledge utilization and coherent conversation flow, a dialogue strategy which controls knowledge selection is instantiated and continuously adapted via reinforcement learning. Under the deployed strategy, knowledge grounded conversations are conducted with two dialogue agents. The generated dialogues are comprehensively evaluated on aspects like informativeness and coherence, which are aligned with our objective and human instinct. These assessments are integrated as a compound reward to guide the evolution of dialogue strategy via policy gradient. Comprehensive experiments have been carried out on the publicly available dataset, demonstrating that the proposed method outperforms the other state-of-the-art approaches significantly.

pdf bib
Baidu Neural Machine Translation Systems for WMT19
Meng Sun | Bojian Jiang | Hao Xiong | Zhongjun He | Hua Wu | Haifeng Wang
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

In this paper we introduce the systems Baidu submitted for the WMT19 shared task on Chinese<->English news translation. Our systems are based on the Transformer architecture with some effective improvements. Data selection, back translation, data augmentation, knowledge distillation, domain adaptation, model ensemble and re-ranking are employed and proven effective in our experiments. Our Chinese->English system achieved the highest case-sensitive BLEU score among all constrained submissions, and our English->Chinese system ranked the second in all submissions.

2018

pdf bib
Addressing Troublesome Words in Neural Machine Translation
Yang Zhao | Jiajun Zhang | Zhongjun He | Chengqing Zong | Hua Wu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

One of the weaknesses of Neural Machine Translation (NMT) is in handling lowfrequency and ambiguous words, which we refer as troublesome words. To address this problem, we propose a novel memoryenhanced NMT method. First, we investigate different strategies to define and detect the troublesome words. Then, a contextual memory is constructed to memorize which target words should be produced in what situations. Finally, we design a hybrid model to dynamically access the contextual memory so as to correctly translate the troublesome words. The extensive experiments on Chinese-to-English and English-to-German translation tasks demonstrate that our method significantly outperforms the strong baseline models in translation quality, especially in handling troublesome words.

pdf bib
Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network
Xiangyang Zhou | Lu Li | Daxiang Dong | Yi Liu | Ying Chen | Wayne Xin Zhao | Dianhai Yu | Hua Wu
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Human generates responses relying on semantic and functional dependencies, including coreference relation, among dialogue elements and their context. In this paper, we investigate matching a response with its multi-turn context using dependency information based entirely on attention. Our solution is inspired by the recently proposed Transformer in machine translation (Vaswani et al., 2017) and we extend the attention mechanism in two ways. First, we construct representations of text segments at different granularities solely with stacked self-attention. Second, we try to extract the truly matched segment pairs with attention across the context and response. We jointly introduce those two kinds of attention in one uniform neural network. Experiments on two large-scale multi-turn response selection tasks show that our proposed model significantly outperforms the state-of-the-art models.

pdf bib
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification
Yizhong Wang | Kai Liu | Jing Liu | Wei He | Yajuan Lyu | Hua Wu | Sujian Li | Haifeng Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Machine reading comprehension (MRC) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by search engine. Compared with MRC on a single passage, multi-passage MRC is more challenging, since we are likely to get multiple confusing answer candidates from different passages. To address this problem, we propose an end-to-end neural model that enables those answer candidates from different passages to verify each other based on their content representations. Specifically, we jointly train three modules that can predict the final answer based on three factors: the answer boundary, the answer content and the cross-passage answer verification. The experimental results show that our method outperforms the baseline by a large margin and achieves the state-of-the-art performance on the English MS-MARCO dataset and the Chinese DuReader dataset, both of which are designed for MRC in real-world settings.

pdf bib
DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications
Wei He | Kai Liu | Jing Liu | Yajuan Lyu | Shiqi Zhao | Xinyan Xiao | Yuan Liu | Yizhong Wang | Hua Wu | Qiaoqiao She | Xuan Liu | Tian Wu | Haifeng Wang
Proceedings of the Workshop on Machine Reading for Question Answering

This paper introduces DuReader, a new large-scale, open-domain Chinese machine reading comprehension (MRC) dataset, designed to address real-world MRC. DuReader has three advantages over previous MRC datasets: (1) data sources: questions and documents are based on Baidu Search and Baidu Zhidao; answers are manually generated. (2) question types: it provides rich annotations for more question types, especially yes-no and opinion questions, that leaves more opportunity for the research community. (3) scale: it contains 200K questions, 420K answers and 1M documents; it is the largest Chinese MRC dataset so far. Experiments show that human performance is well above current state-of-the-art baseline systems, leaving plenty of room for the community to make improvements. To help the community make these improvements, both DuReader and baseline systems have been posted online. We also organize a shared competition to encourage the exploration of more models. Since the release of the task, there are significant improvements over the baselines.

2017

pdf bib
An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge
Yanchao Hao | Yuanzhe Zhang | Kang Liu | Shizhu He | Zhanyi Liu | Hua Wu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

With the rapid growth of knowledge bases (KBs) on the web, how to take full advantage of them becomes increasingly important. Question answering over knowledge base (KB-QA) is one of the promising approaches to access the substantial knowledge. Meanwhile, as the neural network-based (NN-based) methods develop, NN-based KB-QA has already achieved impressive results. However, previous work did not put more emphasis on question representation, and the question is converted into a fixed vector regardless of its candidate answers. This simple representation strategy is not easy to express the proper information in the question. Hence, we present an end-to-end neural network model to represent the questions and their corresponding scores dynamically according to the various candidate answer aspects via cross-attention mechanism. In addition, we leverage the global knowledge inside the underlying KB, aiming at integrating the rich KB information into the representation of the answers. As a result, it could alleviates the out-of-vocabulary (OOV) problem, which helps the cross-attention model to represent the question more precisely. The experimental results on WebQuestions demonstrate the effectiveness of the proposed approach.

2016

pdf bib
Multi-view Response Selection for Human-Computer Conversation
Xiangyang Zhou | Daxiang Dong | Hua Wu | Shiqi Zhao | Dianhai Yu | Hao Tian | Xuan Liu | Rui Yan
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Chinese Poetry Generation with Planning based Neural Network
Zhe Wang | Wei He | Hua Wu | Haiyang Wu | Wei Li | Haifeng Wang | Enhong Chen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Chinese poetry generation is a very challenging task in natural language processing. In this paper, we propose a novel two-stage poetry generating method which first plans the sub-topics of the poem according to the user’s writing intent, and then generates each line of the poem sequentially, using a modified recurrent neural network encoder-decoder framework. The proposed planning-based method can ensure that the generated poem is coherent and semantically consistent with the user’s intent. A comprehensive evaluation with human judgments demonstrates that our proposed approach outperforms the state-of-the-art poetry generating methods and the poem quality is somehow comparable to human poets.

pdf bib
Latent Topic Embedding
Di Jiang | Lei Shi | Rongzhong Lian | Hua Wu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Topic modeling and word embedding are two important techniques for deriving latent semantics from data. General-purpose topic models typically work in coarse granularity by capturing word co-occurrence at the document/sentence level. In contrast, word embedding models usually work in much finer granularity by modeling word co-occurrence within small sliding windows. With the aim of deriving latent semantics by considering word co-occurrence at different levels of granularity, we propose a novel model named Latent Topic Embedding (LTE), which seamlessly integrates topic generation and embedding learning in one unified framework. We further propose an efficient Monte Carlo EM algorithm to estimate the parameters of interest. By retaining the individual advantages of topic modeling and word embedding, LTE results in better latent topics and word embedding. Extensive experiments verify the superiority of LTE over the state-of-the-arts.

pdf bib
Active Learning for Dependency Parsing with Partial Annotation
Zhenghua Li | Min Zhang | Yue Zhang | Zhanyi Liu | Wenliang Chen | Hua Wu | Haifeng Wang
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Minimum Risk Training for Neural Machine Translation
Shiqi Shen | Yong Cheng | Zhongjun He | Wei He | Hua Wu | Maosong Sun | Yang Liu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Semi-Supervised Learning for Neural Machine Translation
Yong Cheng | Wei Xu | Zhongjun He | Wei He | Hua Wu | Maosong Sun | Yang Liu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
Multi-Task Learning for Multiple Language Translation
Daxiang Dong | Hua Wu | Wei He | Dianhai Yu | Haifeng Wang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

pdf bib
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Kristina Toutanova | Hua Wu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Kristina Toutanova | Hua Wu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Policy Learning for Domain Selection in an Extensible Multi-domain Spoken Dialogue System
Zhuoran Wang | Hongliang Chen | Guanchun Wang | Hao Tian | Hua Wu | Haifeng Wang
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Improve Statistical Machine Translation with Context-Sensitive Bilingual Semantic Embedding Model
Haiyang Wu | Daxiang Dong | Xiaoguang Hu | Dianhai Yu | Wei He | Hua Wu | Haifeng Wang | Ting Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Transformation from Discontinuous to Continuous Word Alignment Improves Translation Quality
Zhongjun He | Hua Wu | Haifeng Wang | Ting Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Improving Pivot-Based Statistical Machine Translation by Pivoting the Co-occurrence Count of Phrase Pairs
Xiaoning Zhu | Zhongjun He | Hua Wu | Conghui Zhu | Haifeng Wang | Tiejun Zhao
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf bib
Improving Pivot-Based Statistical Machine Translation Using Random Walk
Xiaoning Zhu | Zhongjun He | Hua Wu | Haifeng Wang | Conghui Zhu | Tiejun Zhao
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Generalization of Words for Chinese Dependency Parsing
Xianchao Wu | Jie Zhou | Yu Sun | Zhanyi Liu | Dianhai Yu | Hua Wu | Haifeng Wang
Proceedings of the 13th International Conference on Parsing Technologies (IWPT 2013)

2012

pdf bib
Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information
Jinsong Su | Hua Wu | Haifeng Wang | Yidong Chen | Xiaodong Shi | Huailin Dong | Qun Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Improve SMT Quality with Automatically Extracted Paraphrase Rules
Wei He | Hua Wu | Haifeng Wang | Ting Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2011

pdf bib
Reordering with Source Language Collocations
Zhanyi Liu | Haifeng Wang | Hua Wu | Ting Liu | Sheng Li
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
Improving Statistical Machine Translation with Monolingual Collocation
Zhanyi Liu | Haifeng Wang | Hua Wu | Sheng Li
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

pdf bib
Collocation Extraction Using Monolingual Word Alignment Method
Zhanyi Liu | Haifeng Wang | Hua Wu | Sheng Li
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Exploiting Heterogeneous Treebanks for Parsing
Zheng-Yu Niu | Haifeng Wang | Hua Wu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Revisiting Pivot Language Approach for Machine Translation
Hua Wu | Haifeng Wang
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf bib
Domain Adaptation for Statistical Machine Translation with Domain Dictionary and Monolingual Corpora
Hua Wu | Haifeng Wang | Chengqing Zong
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

pdf bib
Pivot Language Approach for Phrase-Based Statistical Machine Translation
Hua Wu | Haifeng Wang
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
Using RBMT Systems to Produce Bilingual Corpus for SMT
Xiaoguang Hu | Haifeng Wang | Hua Wu
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf bib
Word Alignment for Languages with Scarce Resources Using Bilingual Corpora of Other Language Pairs
Haifeng Wang | Hua Wu | Zhanyi Liu
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
Boosting Statistical Word Alignment Using Labeled and Unlabeled Data
Hua Wu | Haifeng Wang | Zhanyi Liu
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2005

pdf bib
Improving Statistical Word Alignment with Ensemble Methods
Hua Wu | Haifeng Wang
Second International Joint Conference on Natural Language Processing: Full Papers

pdf bib
Alignment Model Adaptation for Domain-Specific Word Alignment
Hua Wu | Haifeng Wang | Zhanyi Liu
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

pdf bib
Improving Domain-Specific Word Alignment for Computer Assisted Translation
Hua Wu | Haifeng Wang
Proceedings of the ACL Interactive Poster and Demonstration Sessions

pdf bib
Improving Statistical Word Alignment with a Rule-Based Machine Translation System
Hua Wu | Haifeng Wang
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

pdf bib
Optimizing Synonym Extraction Using Monolingual and Bilingual Resources
Hua Wu | Ming Zhou
Proceedings of the Second International Workshop on Paraphrasing

pdf bib
Synonymous Collocation Extraction Using Translation Information
Hua Wu | Ming Zhou
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

2000

pdf bib
Chinese Generation in a Spoken Dialogue Translation System
Hua Wu | Taiyi Huang | Chengqing Zong
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

Search
Co-authors