Chengqing Zong

Also published as: Cheng-qing Zong


2020

Attend, Translate and Summarize: An Efficient Method for Neural Cross-Lingual Summarization
Junnan Zhu | Yu Zhou | Jiajun Zhang | Chengqing Zong
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Cross-lingual summarization aims at summarizing a document in one language (e.g., Chinese) into another language (e.g., English). In this paper, we propose a novel method inspired by the translation pattern in the process of obtaining a cross-lingual summary. We first attend to some words in the source text, then translate them into the target language, and summarize to get the final summary. Specifically, we first employ the encoder-decoder attention distribution to attend to the source words. Second, we present three strategies to acquire the translation probability, which helps obtain the translation candidates for each source word. Finally, each summary word is generated either from the neural distribution or from the translation candidates of source words. Experimental results on Chinese-to-English and English-to-Chinese summarization tasks have shown that our proposed method can significantly outperform the baselines, achieving comparable performance with the state-of-the-art.
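
The final generation step described above can be illustrated with a minimal sketch. It assumes a scalar gate `p_gen` that mixes the two sources of probability mass; the gate, the function name `mix_distributions`, and the dictionary-based representation are illustrative assumptions, not the authors' implementation.

```python
def mix_distributions(p_neural, attention, trans_probs, p_gen):
    """Mix a neural vocabulary distribution with translation candidates.

    p_neural:    dict, target word -> probability from the decoder softmax
    attention:   list of attention weights, one per source word (sums to 1)
    trans_probs: list of dicts, one per source word,
                 target word -> translation probability (each sums to 1)
    p_gen:       hypothetical gate in [0, 1]; weight of the neural distribution
    """
    # Share of probability mass coming from the neural distribution.
    final = {word: p_gen * prob for word, prob in p_neural.items()}
    # Remaining mass: translation candidates weighted by attention.
    for weight, candidates in zip(attention, trans_probs):
        for word, prob in candidates.items():
            final[word] = final.get(word, 0.0) + (1.0 - p_gen) * weight * prob
    return final


if __name__ == "__main__":
    mixed = mix_distributions(
        p_neural={"cat": 0.5, "dog": 0.5},
        attention=[0.75, 0.25],
        trans_probs=[{"cat": 1.0}, {"dog": 0.5, "puppy": 0.5}],
        p_gen=0.5,
    )
    print(mixed)
```

In this sketch a word such as "puppy", which the neural distribution does not cover, can still receive probability through the translation candidates of an attended source word.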

CASIA’s System for IWSLT 2020 Open Domain Translation
Qian Wang | Yuchen Liu | Cong Ma | Yu Lu | Yining Wang | Long Zhou | Yang Zhao | Jiajun Zhang | Chengqing Zong
Proceedings of the 17th International Conference on Spoken Language Translation

This paper describes CASIA’s system for the IWSLT 2020 open domain translation task. This year we participate in both the Chinese→Japanese and Japanese→Chinese translation tasks. Our system is a neural machine translation system based on the Transformer model. We augment the training data with knowledge distillation and back translation to improve the translation performance. Domain data classification and weighted domain model ensemble are introduced to generate the final translation result. We compare and analyze the performance on development data with different model settings and different data processing techniques.

Improving Autoregressive NMT with Non-Autoregressive Model
Long Zhou | Jiajun Zhang | Chengqing Zong
Proceedings of the First Workshop on Automatic Simultaneous Translation

Autoregressive neural machine translation (NMT) models are often used to teach non-autoregressive models via knowledge distillation. However, there are few studies on improving the quality of autoregressive translation (AT) using non-autoregressive translation (NAT). In this work, we propose a novel Encoder-NAD-AD framework for NMT, aiming at boosting AT with global information produced by the NAT model. Specifically, under the semantic guidance of source-side context captured by the encoder, the non-autoregressive decoder (NAD) first learns to generate the target-side hidden state sequence in parallel. Then the autoregressive decoder (AD) performs translation from left to right, conditioned on source-side and target-side hidden states. Since AD has global information generated by the low-latency NAD, it is more likely to produce a better translation with less time delay. Experiments on WMT14 En-De, WMT16 En-Ro, and IWSLT14 De-En translation tasks demonstrate that our framework achieves significant improvements with only 8% speed degradation over the autoregressive NMT.

Proceedings of the 28th International Conference on Computational Linguistics
Donia Scott | Nuria Bel | Chengqing Zong
Proceedings of the 28th International Conference on Computational Linguistics

Dual Attention Network for Cross-lingual Entity Alignment
Jian Sun | Yu Zhou | Chengqing Zong
Proceedings of the 28th International Conference on Computational Linguistics

Cross-lingual entity alignment is an essential part of building a knowledge graph, which can help integrate knowledge among different language knowledge graphs. In real KGs, there exists an imbalance among the information in the same hierarchy of corresponding entities, which results in the heterogeneity of neighborhood structure, making this task challenging. To tackle this problem, we propose a dual attention network for cross-lingual entity alignment (DAEA). Specifically, our dual attention consists of relation-aware graph attention and hierarchical attention. The relation-aware graph attention aims at selectively aggregating multi-hierarchy neighborhood information to alleviate the difference of heterogeneity among counterpart entities. The hierarchical attention adaptively aggregates the low-hierarchy and the high-hierarchy information, which is beneficial to balance the neighborhood information of counterpart entities and distinguish non-counterpart entities with similar structures. Finally, we treat cross-lingual entity alignment as a process of linking prediction. Experimental results on three real-world cross-lingual entity alignment datasets have shown the effectiveness of DAEA.

Distill and Replay for Continual Language Learning
Jingyuan Sun | Shaonan Wang | Jiajun Zhang | Chengqing Zong
Proceedings of the 28th International Conference on Computational Linguistics

Accumulating knowledge to tackle new tasks without necessarily forgetting the old ones is a hallmark of human-like intelligence. But the current dominant paradigm of machine learning is still to train a model that works well on static datasets. When learning tasks in a stream where data distribution may fluctuate, fitting on new tasks often leads to forgetting on the previous ones. We propose a simple yet effective framework that continually learns natural language understanding tasks with one model. Our framework distills knowledge and replays experience from previous tasks when fitting on a new task, thus named DnR (distill and replay). The framework is based on language models and can be smoothly built with different language model architectures. Experimental results demonstrate that DnR outperforms previous state-of-the-art models in continually learning tasks of the same type but from different domains, as well as tasks of different types. With the distillation method, we further show that it’s possible for DnR to incrementally compress the model size while still outperforming most of the baselines. We hope that DnR could promote the empirical application of continual language learning, and contribute to building human-level language intelligence minimally bothered by catastrophic forgetting.

Knowledge Graph Enhanced Neural Machine Translation via Multi-task Learning on Sub-entity Granularity
Yang Zhao | Lu Xiang | Junnan Zhu | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of the 28th International Conference on Computational Linguistics

Previous studies combining knowledge graph (KG) with neural machine translation (NMT) have two problems: i) Knowledge under-utilization: they only focus on the entities that appear in both KG and training sentence pairs, making much knowledge in KG unable to be fully utilized. ii) Granularity mismatch: the current KG methods utilize the entity as the basic granularity, while NMT utilizes the sub-word as the granularity, making the KG difficult to utilize in NMT. To alleviate the above problems, we propose a multi-task learning method on sub-entity granularity. Specifically, we first split the entities in KG and sentence pairs into sub-entity granularity by using joint BPE. Then we utilize multi-task learning to combine the machine translation task and knowledge reasoning task. The extensive experiments on various translation tasks have demonstrated that our method significantly outperforms the baseline models in both translation quality and handling the entities.

Multimodal Sentence Summarization via Multimodal Selective Encoding
Haoran Li | Junnan Zhu | Jiajun Zhang | Xiaodong He | Chengqing Zong
Proceedings of the 28th International Conference on Computational Linguistics

This paper studies the problem of generating a summary for a given sentence-image pair. Existing multimodal sequence-to-sequence approaches mainly focus on enhancing the decoder by visual signals, while ignoring that the image can improve the ability of the encoder to identify highlights of a news event or a document. Thus, we propose a multimodal selective gate network that considers reciprocal relationships between textual and multi-level visual features, including global image descriptor, activation grids, and object proposals, to select highlights of the event when encoding the source sentence. In addition, we introduce a modality regularization to encourage the summary to capture the highlights embedded in the image more accurately. To verify the generalization of our model, we adopt the multimodal selective gate to the text-based decoder and multimodal-based decoder. Experimental results on a public multimodal sentence summarization dataset demonstrate the advantage of our models over baselines. Further analysis suggests that our proposed multimodal selective gate network can effectively select important information in the input sentence.

A Knowledge-driven Generative Model for Multi-implication Chinese Medical Procedure Entity Normalization
Jinghui Yan | Yining Wang | Lu Xiang | Yu Zhou | Chengqing Zong
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Medical entity normalization, which links medical mentions in the text to entities in knowledge bases, is an important research topic in medical natural language processing. In this paper, we focus on Chinese medical procedure entity normalization. However, nonstandard Chinese expressions and combined procedures present challenges in our problem. The existing strategies relying on the discriminative model cope poorly with normalizing combined procedure mentions. We propose a sequence generative framework to directly generate all the corresponding medical procedure entities. We adopt two strategies, category-based constraint decoding and category-based model refining, to avoid unrealistic results. The method is capable of linking entities when a mention contains multiple procedure concepts, and our comprehensive experiments demonstrate that the proposed model can achieve remarkable improvements over existing baselines, particularly in the case of multi-implication Chinese medical procedures.

Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning
Xiaomian Kang | Yang Zhao | Jiajun Zhang | Chengqing Zong
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Document-level neural machine translation has yielded attractive improvements. However, the majority of existing methods roughly use all context sentences in a fixed scope. They neglect the fact that different source sentences need different sizes of context. To address this problem, we propose an effective approach to select dynamic context so that the document-level translation model can utilize the more useful selected context sentences to produce better translations. Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence. Then, we propose two strategies to explicitly select a variable number of context sentences and feed them into the translation module. We train the two modules end-to-end via reinforcement learning. A novel reward is proposed to encourage the selection and utilization of dynamic context sentences. Experiments demonstrate that our approach can select adaptive context sentences for different source sentences, and significantly improves the performance of document-level translation methods.

2019

Synchronous Bidirectional Neural Machine Translation
Long Zhou | Jiajun Zhang | Chengqing Zong
Transactions of the Association for Computational Linguistics, Volume 7

Existing approaches to neural machine translation (NMT) generate the target language sequence token-by-token from left to right. However, this kind of unidirectional decoding framework cannot make full use of the target-side future contexts which can be produced in a right-to-left decoding direction, and thus suffers from the issue of unbalanced outputs. In this paper, we introduce a synchronous bidirectional neural machine translation (SB-NMT) model that predicts its outputs using left-to-right and right-to-left decoding simultaneously and interactively, in order to leverage both history and future information at the same time. Specifically, we first propose a new algorithm that enables synchronous bidirectional decoding in a single model. Then, we present an interactive decoding model in which left-to-right (right-to-left) generation does not only depend on its previously generated outputs, but also relies on future contexts predicted by right-to-left (left-to-right) decoding. We extensively evaluate the proposed SB-NMT model on large-scale NIST Chinese–English, WMT14 English–German, and WMT18 Russian–English translation tasks. Experimental results demonstrate that our model achieves significant improvements over the strong Transformer model by 3.92, 1.49, and 1.04 BLEU points, respectively, and obtains the state-of-the-art performance on Chinese–English and English–German translation tasks.

Are You for Real? Detecting Identity Fraud via Dialogue Interactions
Weikang Wang | Jiajun Zhang | Qian Li | Chengqing Zong | Zhifei Li
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Identity fraud detection is of great importance in many real-world scenarios such as the financial industry. However, few studies addressed this problem before. In this paper, we focus on identity fraud detection in loan applications and propose to solve this problem with a novel interactive dialogue system which consists of two modules. One is the knowledge graph (KG) constructor organizing the personal information for each loan applicant. The other is structured dialogue management that can dynamically generate a series of questions based on the personal KG to ask the applicants and determine their identity states. We also present a heuristic user simulator based on problem analysis to evaluate our method. Experiments have shown that the trainable dialogue system can effectively detect fraudsters, and achieve higher recognition accuracy compared with rule-based systems. Furthermore, our learned dialogue strategies are interpretable and flexible, which can help promote real-world applications.

Attribute-aware Sequence Network for Review Summarization
Junjie Li | Xuepeng Wang | Dawei Yin | Chengqing Zong
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Review summarization aims to generate a condensed summary for a review or multiple reviews. Existing review summarization systems mainly generate summaries based only on review content and neglect the authors’ attributes (e.g., gender, age, and occupation). In fact, when summarizing a review, users with different attributes usually pay attention to specific aspects and have their own word-using habits or writing styles. Therefore, we propose an Attribute-aware Sequence Network (ASN) to take the aforementioned users’ characteristics into account, which includes three modules: an attribute encoder encodes the attribute preferences over the words; an attribute-aware review encoder adopts an attribute-based selective mechanism to select the important information of a review; and an attribute-aware summary decoder incorporates attribute embedding and attribute-specific word-using habits into word prediction. To validate our model, we collect a new dataset TripAtt, comprising 495,440 attribute-review-summary triplets with three kinds of attribute information: gender, age, and travel status. Extensive experiments show that ASN achieves state-of-the-art performance on review summarization in both auto-metric ROUGE and human evaluation.

NCLS: Neural Cross-Lingual Summarization
Junnan Zhu | Qian Wang | Yining Wang | Yu Zhou | Jiajun Zhang | Shaonan Wang | Chengqing Zong
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Cross-lingual summarization (CLS) is the task of producing a summary in one particular language for a source document in a different language. Existing methods simply divide this task into two steps: summarization and translation, leading to the problem of error propagation. To handle that, we present an end-to-end CLS framework, which we refer to as Neural Cross-Lingual Summarization (NCLS), for the first time. Moreover, we propose to further improve NCLS by incorporating two related tasks, monolingual summarization and machine translation, into the training process of CLS under multi-task learning. Due to the lack of supervised CLS data, we propose a round-trip translation strategy to acquire two high-quality large-scale CLS datasets based on existing monolingual summarization datasets. Experimental results have shown that our NCLS achieves remarkable improvement over traditional pipeline methods on both English-to-Chinese and Chinese-to-English CLS human-corrected test sets. In addition, NCLS with multi-task learning can further significantly improve the quality of generated summaries. We make our dataset and code publicly available here: http://www.nlpr.ia.ac.cn/cip/dataset.htm.

Synchronously Generating Two Languages with Interactive Decoding
Yining Wang | Jiajun Zhang | Long Zhou | Yuchen Liu | Chengqing Zong
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In this paper, we introduce a novel interactive approach to translate a source language into two different languages simultaneously and interactively. Specifically, the generation of one language relies on not only previously generated outputs by itself, but also the outputs predicted in the other language. Experimental results on IWSLT and WMT datasets demonstrate that our method can obtain significant improvements over both conventional Neural Machine Translation (NMT) model and multilingual NMT model.

A Compact and Language-Sensitive Multilingual Translation Method
Yining Wang | Long Zhou | Jiajun Zhang | Feifei Zhai | Jingfang Xu | Chengqing Zong
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Multilingual neural machine translation (Multi-NMT) with one encoder-decoder model has made remarkable progress due to its simple deployment. However, this multilingual translation paradigm does not make full use of language commonality and parameter sharing between encoder and decoder. Furthermore, this kind of paradigm cannot outperform the individual models trained on bilingual corpus in most cases. In this paper, we propose a compact and language-sensitive method for multilingual translation. To maximize parameter sharing, we first present a universal representor to replace both encoder and decoder models. To make the representor sensitive for specific languages, we further introduce language-sensitive embedding, attention, and discriminator with the ability to enhance model performance. We verify our methods on various translation scenarios, including one-to-many, many-to-many and zero-shot. Extensive experiments demonstrate that our proposed methods remarkably outperform strong standard multilingual translation systems on WMT and IWSLT datasets. Moreover, we find that our model is especially helpful in low-resource and zero-shot translation scenarios.

Incremental Learning from Scratch for Task-Oriented Dialogue Systems
Weikang Wang | Jiajun Zhang | Qian Li | Mei-Yuh Hwang | Chengqing Zong | Zhifei Li
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Clarifying user needs is essential for existing task-oriented dialogue systems. However, in real-world applications, developers can never guarantee that all possible user demands are taken into account in the design phase. Consequently, existing systems will break down when encountering unconsidered user needs. To address this problem, we propose a novel incremental learning framework to design task-oriented dialogue systems, or for short Incremental Dialogue System (IDS), without pre-defining the exhaustive list of user needs. Specifically, we introduce an uncertainty estimation module to evaluate the confidence of giving correct responses. If there is high confidence, IDS will provide responses to users. Otherwise, humans will be involved in the dialogue process, and IDS can learn from human intervention through an online learning module. To evaluate our method, we propose a new dataset which simulates unanticipated user needs in the deployment stage. Experiments show that IDS is robust to unconsidered user actions, and can update itself online by smartly selecting only the most effective training data, and hence attains better performance with less annotation cost.
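
The confidence-based handoff described above can be sketched roughly as follows. The threshold rule, the `route_turn` name, and the callback-based human fallback are assumptions made for illustration; the paper's uncertainty estimation and online learning modules are not reproduced here.

```python
def route_turn(candidates, threshold, human_answer, training_buffer, context):
    """Answer automatically when confident, otherwise defer to a human.

    candidates:      list of (response, confidence) pairs scored by the system
    threshold:       hypothetical confidence cutoff in [0, 1]
    human_answer:    callable taking the dialogue context, returning a response
    training_buffer: list collecting (context, response) pairs for later updates
    context:         the current dialogue context
    Returns (response, human_involved).
    """
    best_response, confidence = max(candidates, key=lambda pair: pair[1])
    if confidence >= threshold:
        return best_response, False
    # Low confidence: involve a human and keep the example for online learning.
    response = human_answer(context)
    training_buffer.append((context, response))
    return response, True


if __name__ == "__main__":
    buffer = []
    reply, escalated = route_turn(
        candidates=[("Your order has shipped.", 0.31)],
        threshold=0.5,
        human_answer=lambda ctx: "Let me check that for you.",
        training_buffer=buffer,
        context="Where is my refund?",
    )
    print(reply, escalated, len(buffer))
```

Only the turns that fall below the threshold end up in the buffer, which is one simple way to keep annotation cost tied to the cases the system handles poorly.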

Memory Consolidation for Contextual Spoken Language Understanding with Dialogue Logistic Inference
He Bai | Yu Zhou | Jiajun Zhang | Chengqing Zong
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Dialogue contexts are proven helpful in the spoken language understanding (SLU) system and they are typically encoded with explicit memory representations. However, most of the previous models learn the context memory with only one objective of maximizing the SLU performance, leaving the context memory under-exploited. In this paper, we propose a new dialogue logistic inference (DLI) task to consolidate the context memory jointly with SLU in the multi-task framework. DLI is defined as sorting a shuffled dialogue session into its original logical order and shares the same memory encoder and retrieval mechanism as the SLU model. Our experimental results show that various popular contextual SLU models can benefit from our approach, and improvements are quite impressive, especially in slot filling.
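
As an illustration of the DLI task definition, the sketch below builds one training example by shuffling a session and recording where each utterance came from. The function names and the choice of original indices as the target are hypothetical; the abstract does not specify how examples are constructed.

```python
import random


def make_dli_example(session, seed=None):
    """Shuffle a dialogue session and return (shuffled, order).

    order[k] is the original position of shuffled[k], so a model that
    predicts `order` has recovered the session's original sequence.
    """
    rng = random.Random(seed)
    order = list(range(len(session)))
    rng.shuffle(order)
    shuffled = [session[i] for i in order]
    return shuffled, order


def restore_order(shuffled, order):
    """Invert the shuffle using the (predicted or gold) original positions."""
    restored = [None] * len(shuffled)
    for k, original_position in enumerate(order):
        restored[original_position] = shuffled[k]
    return restored


if __name__ == "__main__":
    session = ["Hi, I need a flight.", "Where to?", "Beijing.", "Which date?"]
    shuffled, order = make_dli_example(session, seed=13)
    assert restore_order(shuffled, order) == session
```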

2018

Exploiting Pre-Ordering for Neural Machine Translation
Yang Zhao | Jiajun Zhang | Chengqing Zong
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

One Sentence One Model for Neural Machine Translation
Xiaoqing Li | Jiajun Zhang | Chengqing Zong
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Associative Multichannel Autoencoder for Multimodal Word Representation
Shaonan Wang | Jiajun Zhang | Chengqing Zong
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In this paper we address the problem of learning multimodal word representations by integrating textual, visual and auditory inputs. Inspired by the re-constructive and associative nature of human memory, we propose a novel associative multichannel autoencoder (AMA). Our model first learns the associations between textual and perceptual modalities, so as to predict the missing perceptual information of concepts. Then the textual and predicted perceptual representations are fused through reconstructing their original and associated embeddings. Using a gating mechanism our model assigns different weights to each modality according to the different concepts. Results on six benchmark concept similarity tests show that the proposed method significantly outperforms strong unimodal baselines and state-of-the-art multimodal models.

Addressing Troublesome Words in Neural Machine Translation
Yang Zhao | Jiajun Zhang | Zhongjun He | Chengqing Zong | Hua Wu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

One of the weaknesses of Neural Machine Translation (NMT) is in handling low-frequency and ambiguous words, which we refer to as troublesome words. To address this problem, we propose a novel memory-enhanced NMT method. First, we investigate different strategies to define and detect the troublesome words. Then, a contextual memory is constructed to memorize which target words should be produced in what situations. Finally, we design a hybrid model to dynamically access the contextual memory so as to correctly translate the troublesome words. The extensive experiments on Chinese-to-English and English-to-German translation tasks demonstrate that our method significantly outperforms the strong baseline models in translation quality, especially in handling troublesome words.

Memory, Show the Way: Memory Based Few Shot Word Representation Learning
Jingyuan Sun | Shaonan Wang | Chengqing Zong
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Distributional semantic models (DSMs) generally require sufficient examples for a word to learn a high quality representation. This is in stark contrast with humans, who can guess the meaning of a word from one or a few referents only. In this paper, we propose Mem2Vec, a memory based embedding learning method capable of acquiring high quality word representations from fairly limited context. Our method directly adapts the representations produced by a DSM with a long-term memory to guide its guess of a novel word. Based on a pre-trained embedding space, the proposed method delivers impressive performance on two challenging few-shot word similarity tasks. Embeddings learned with our method also lead to considerable improvements over strong baselines on NER and sentiment classification.

Three Strategies to Improve One-to-Many Multilingual Translation
Yining Wang | Jiajun Zhang | Feifei Zhai | Jingfang Xu | Chengqing Zong
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Due to the benefits of model compactness, multilingual translation (including many-to-one, many-to-many and one-to-many) based on a universal encoder-decoder architecture attracts more and more attention. However, previous studies show that one-to-many translation based on this framework cannot perform on par with the individually trained models. In this work, we introduce three strategies to improve one-to-many multilingual translation by balancing the shared and unique features. Within the architecture of one decoder for all target languages, we first exploit the use of unique initial states for different target languages. Then, we employ language-dependent positional embeddings. Finally and especially, we propose to divide the hidden cells of the decoder into shared and language-dependent ones. The extensive experiments demonstrate that our proposed methods can obtain remarkable improvements over the strong baselines. Moreover, our strategies can achieve comparable or even better performance than the individually trained translation models.

A Teacher-Student Framework for Maintainable Dialog Manager
Weikang Wang | Jiajun Zhang | Han Zhang | Mei-Yuh Hwang | Chengqing Zong | Zhifei Li
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Reinforcement learning (RL) is an attractive solution for task-oriented dialog systems. However, extending RL-based systems to handle new intents and slots requires a system redesign. The high maintenance cost makes it difficult to apply RL methods to practical systems on a large scale. To address this issue, we propose a practical teacher-student framework to extend RL-based dialog systems without retraining from scratch. Specifically, the “student” is an extended dialog manager based on a new ontology, and the “teacher” is existing resources used for guiding the learning process of the “student”. By specifying constraints held in the new dialog manager, we transfer knowledge of the “teacher” to the “student” without additional resources. Experiments show that the performance of the extended system is comparable to the system trained from scratch. More importantly, the proposed framework makes no assumption about the unsupported intents and slots, which makes it possible to improve RL-based systems incrementally.

MSMO: Multimodal Summarization with Multimodal Output
Junnan Zhu | Haoran Li | Tianshang Liu | Yu Zhou | Jiajun Zhang | Chengqing Zong
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Multimodal summarization has drawn much attention due to the rapid growth of multimedia data. The output of the current multimodal summarization systems is usually represented in texts. However, we have found through experiments that multimodal output can significantly improve user satisfaction for informativeness of summaries. In this paper, we propose a novel task, multimodal summarization with multimodal output (MSMO). To handle this task, we first collect a large-scale dataset for MSMO research. We then propose a multimodal attention model to jointly generate text and select the most relevant image from the multimodal input. Finally, to evaluate multimodal outputs, we construct a novel multimodal automatic evaluation (MMAE) method which considers both intra-modality salience and inter-modality relevance. The experimental results show the effectiveness of MMAE.

Adopting the Word-Pair-Dependency-Triplets with Individual Comparison for Natural Language Inference
Qianlong Du | Chengqing Zong | Keh-Yih Su
Proceedings of the 27th International Conference on Computational Linguistics

This paper proposes to perform natural language inference with Word-Pair-Dependency-Triplets. Most previous DNN-based approaches either ignore syntactic dependency among words, or directly use tree-LSTM to generate sentence representation with irrelevant information. To overcome the problems mentioned above, we adopt Word-Pair-Dependency-Triplets to improve alignment and inference judgment. To be specific, instead of comparing each triplet from one passage with the merged information of another passage, we first propose to perform comparison directly between the triplets of the given passage-pair to make the judgement more interpretable. Experimental results show that the performance of our approach is better than most of the approaches that use tree structures, and is comparable to other state-of-the-art approaches.

Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings
Junjie Li | Haitong Yang | Chengqing Zong
Proceedings of the 27th International Conference on Computational Linguistics

Document-level multi-aspect sentiment classification aims to predict users’ sentiment polarities for different aspects of a product in a review. Existing approaches mainly focus on text information. However, the authors (i.e. users) and overall ratings of reviews are ignored, both of which are shown in this paper to be significant for interpreting the sentiments of different aspects. Therefore, we propose a model called Hierarchical User Aspect Rating Network (HUARN) to consider user preference and overall ratings jointly. Specifically, HUARN adopts a hierarchical architecture to encode word, sentence, and document level information. Then, user attention and aspect attention are introduced into building sentence and document level representation. The document representation is combined with user and overall rating information to predict aspect ratings of a review. Diverse aspects are treated differently and a multi-task framework is adopted. Empirical results on two real-world datasets show that HUARN achieves state-of-the-art performances.

Ensure the Correctness of the Summary: Incorporate Entailment Knowledge into Abstractive Sentence Summarization
Haoran Li | Junnan Zhu | Jiajun Zhang | Chengqing Zong
Proceedings of the 27th International Conference on Computational Linguistics

In this paper, we investigate the sentence summarization task that produces a summary from a source sentence. Neural sequence-to-sequence models have gained considerable success for this task, but most existing approaches focus only on improving the informativeness of the summary and ignore its correctness, i.e., the summary should not contain unrelated information with respect to the source sentence. We argue that correctness is an essential requirement for summarization systems. Considering that a correct summary is semantically entailed by the source sentence, we incorporate entailment knowledge into abstractive summarization models. We propose an entailment-aware encoder under a multi-task framework (i.e., summarization generation and entailment recognition) and an entailment-aware decoder trained by entailment Reward Augmented Maximum Likelihood (RAML). Experimental results demonstrate that our models significantly outperform baselines in terms of informativeness and correctness.

Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language
He Bai | Yu Zhou | Jiajun Zhang | Liang Zhao | Mei-Yuh Hwang | Chengqing Zong
Proceedings of the 27th International Conference on Computational Linguistics

To deploy a spoken language understanding (SLU) model to a new language, language transferring is desired to avoid the trouble of acquiring and labeling a new big SLU corpus. An SLU corpus is a monolingual corpus with domain/intent/slot labels. Translating the original SLU corpus into the target language is an attractive strategy. However, SLU corpora consist of plenty of semantic labels (slots), which general-purpose translators cannot handle well, not to mention additional cultural differences. This paper focuses on the language transferring task given a small in-domain parallel SLU corpus. The in-domain parallel corpus can be used as the first adaptation on the general translator. But more importantly, we show how to use reinforcement learning (RL) to further adapt the adapted translator, where translated sentences with more proper slot tags receive higher rewards. Our reward is derived from the source input sentence exclusively, unlike reward via actor-critic methods or computing reward with a ground truth target sentence. Hence we can adapt the translator a second time, using the big monolingual SLU corpus from the source language. We evaluate our approach on Chinese to English language transferring for SLU systems. The experimental results show that the generated English SLU corpus via adaptation and reinforcement learning gives us over 97% in the slot F1 score and over 84% accuracy in domain classification. It demonstrates the effectiveness of the proposed language transferring method. Compared with naive translation, our proposed method improves domain classification accuracy by 22% relative and the slot filling F1 score by more than 71% relative.

2017

Neural System Combination for Machine Translation
Long Zhou | Wenpeng Hu | Jiajun Zhang | Chengqing Zong
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Neural machine translation (NMT) has become a new approach to machine translation and generates much more fluent results compared to statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. Extensive experiments on the Chinese-to-English translation task show that our model achieves a significant improvement of 5.3 BLEU points over the best single system output and 3.4 BLEU points over the state-of-the-art traditional system combination methods.

Towards Neural Machine Translation with Partially Aligned Corpora
Yining Wang | Yang Zhao | Jiajun Zhang | Chengqing Zong | Zhengshan Xue
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

While neural machine translation (NMT) has become the new paradigm, its parameter optimization requires large-scale parallel data, which is scarce in many domains and language pairs. In this paper, we address a new translation scenario in which only monolingual corpora and phrase pairs exist. We propose a new method towards translation with partially aligned sentence pairs which are derived from the phrase pairs and monolingual corpora. To make full use of the partially aligned corpora, we adapt the conventional NMT training method in two aspects. On the one hand, different generation strategies are designed for aligned and unaligned target words. On the other hand, a different objective function is designed to model the partially aligned parts. The experiments demonstrate that our method can achieve a relatively good result in such a translation scenario, and tiny bitexts can boost translation quality to a large extent.

Learning from Parenthetical Sentences for Term Translation in Machine Translation
Guoping Huang | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing

Terms exist extensively in specific domains, and term translation plays a critical role in domain-specific machine translation (MT) tasks. However, translating them correctly is challenging because of the huge number of pre-existing terms and the endless stream of new terms. To achieve better term translation quality, it is necessary to inject external term knowledge into the underlying MT system. Fortunately, there is plenty of term translation knowledge in parenthetical sentences on the Internet. In this paper, we propose a simple, straightforward and effective framework to improve term translation by learning from parenthetical sentences. This framework includes: (1) a focused web crawler; (2) a parenthetical sentence filter, acquiring parenthetical sentences including bilingual term pairs; (3) a term translation knowledge extractor, extracting bilingual term translation candidates; (4) a probability learner, generating the term translation table for MT decoders. The extensive experiments demonstrate that our proposed framework significantly improves the translation quality of terms and sentences.

Exploiting Word Internal Structures for Generic Chinese Sentence Representation
Shaonan Wang | Jiajun Zhang | Chengqing Zong
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We introduce a novel mixed character-word architecture to improve Chinese sentence representations, by utilizing rich semantic information of word internal structures. Our architecture uses two key strategies. The first is a mask gate on characters, learning the relation among characters in a word. The second is a max-pooling operation on words, adaptively finding the optimal mixture of the atomic and compositional word representations. Finally, the proposed architecture is applied to various sentence composition models, which achieves substantial performance gains over baseline models on the sentence similarity task.

Multi-modal Summarization for Asynchronous Collection of Text, Image, Audio and Video
Haoran Li | Junnan Zhu | Cong Ma | Jiajun Zhang | Chengqing Zong
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

The rapid increase of multimedia data over the Internet necessitates multi-modal summarization from collections of text, image, audio and video. In this work, we propose an extractive Multi-modal Summarization (MMS) method which can automatically generate a textual summary given a set of documents, images, audios and videos related to a specific topic. The key idea is to bridge the semantic gaps between multi-modal contents. For audio information, we design an approach to selectively use its transcription. For visual information, we learn joint representations of texts and images using a neural network. Finally, all the multi-modal aspects are considered to generate the textual summary by maximizing the salience, non-redundancy, readability and coverage through budgeted optimization of submodular functions. We further introduce an MMS corpus in English and Chinese. The experimental results on this dataset demonstrate that our method outperforms other competitive baseline methods.

2016

A Bilingual Discourse Corpus and Its Applications
Yang Liu | Jiajun Zhang | Chengqing Zong | Yating Yang | Xi Zhou
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Existing discourse research focuses only on the monolingual setting, and the inconsistency between languages limits the power of discourse theory in multilingual applications such as machine translation. To address this issue, we design and build a bilingual discourse corpus in which we are currently defining and annotating the bilingual elementary discourse units (BEDUs). The BEDUs are then organized into hierarchical structures. Using this discourse style, we have annotated nearly 20K LDC sentences. Finally, we design a bilingual discourse based method for machine translation evaluation and show the effectiveness of our bilingual discourse annotations.

Exploiting Source-side Monolingual Data in Neural Machine Translation
Jiajun Zhang | Chengqing Zong
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

An End-to-End Chinese Discourse Parser with Adaptation to Explicit and Non-explicit Relation Recognition
Xiaomian Kang | Haoran Li | Long Zhou | Jiajun Zhang | Chengqing Zong
Proceedings of the CoNLL-16 shared task

An Empirical Exploration of Skip Connections for Sequential Tagging
Huijia Wu | Jiajun Zhang | Chengqing Zong
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In this paper, we empirically explore the effects of various kinds of skip connections in stacked bidirectional LSTMs for sequential tagging. We investigate three kinds of skip connections connecting to LSTM cells: (a) skip connections to the gates, (b) skip connections to the internal states and (c) skip connections to the cell outputs. We present comprehensive experiments showing that skip connections to cell outputs outperform the remaining two. Furthermore, we observe that using gated identity functions as skip mappings works well. Based on these novel skip connections, we successfully train deep stacked bidirectional LSTM models and obtain state-of-the-art results on CCG supertagging and comparable results on POS tagging.

2015

Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Chengqing Zong | Michael Strube
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Chengqing Zong | Michael Strube
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Domain Adaptation for Syntactic and Semantic Dependency Parsing Using Deep Belief Networks
Haitong Yang | Tao Zhuang | Chengqing Zong
Transactions of the Association for Computational Linguistics, Volume 3

In current systems for syntactic and semantic dependency parsing, people usually define a very high-dimensional feature space to achieve good performance. But these systems often suffer severe performance drops on out-of-domain test data due to the diversity of features of different domains. This paper focuses on how to relieve this domain adaptation problem with the help of unlabeled target domain data. We propose a deep learning method to adapt both syntactic and semantic parsers. With additional unlabeled target domain data, our method can learn a latent feature representation (LFR) that is beneficial to both domains. Experiments on English data in the CoNLL 2009 shared task show that our method largely reduced the performance drop on out-of-domain test data. Moreover, we get a Macro F1 score that is 2.32 points higher than the best system in the CoNLL 2009 shared task in out-of-domain tests.

2014

Bilingually-constrained Phrase Embeddings for Machine Translation
Jiajun Zhang | Shujie Liu | Mu Li | Ming Zhou | Chengqing Zong
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Enhancing Grammatical Cohesion: Generating Transitional Expressions for SMT
Mei Tu | Yu Zhou | Chengqing Zong
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

RNN-based Derivation Structure Prediction for SMT
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing
Le Sun | Chengqing Zong | Min Zhang | Gina-Anne Levow
Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing

A Study on Personal Attributes Extraction Based on the Combination of Sentences Classifications and Rules
Nan-chang Cheng | Cheng-qing Zong | Min Hou | Yong-lin Teng
Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing

Dynamically Integrating Cross-Domain Translation Memory into Phrase-Based Machine Translation during Decoding
Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Multi-Predicate Semantic Role Labeling
Haitong Yang | Chengqing Zong
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Unsupervised Tree Induction for Tree-based Translation
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Transactions of the Association for Computational Linguistics, Volume 1

In current research, most tree-based translation models are built directly from parse trees. In this study, we go in another direction and build a translation model with an unsupervised tree structure derived from a novel non-parametric Bayesian model. In the model, we utilize synchronous tree substitution grammars (STSG) to capture the bilingual mapping between language pairs. To train the model efficiently, we develop a Gibbs sampler with three novel Gibbs operators. The sampler is capable of exploring the infinite space of tree structures by performing local changes on the tree nodes. Experimental results show that the string-to-tree translation system using our Bayesian tree structures significantly outperforms the strong baseline string-to-tree system using parse trees.

Large-scale Word Alignment Using Soft Dependency Cohesion Constraints
Zhiguo Wang | Chengqing Zong
Transactions of the Association for Computational Linguistics, Volume 1

Dependency cohesion refers to the observation that phrases dominated by disjoint dependency subtrees in the source language generally do not overlap in the target language. It has been verified to be a useful constraint for word alignment. However, previous work either treats this as a hard constraint or uses it as a feature in discriminative models, which is ineffective for large-scale tasks. In this paper, we take dependency cohesion as a soft constraint, and integrate it into a generative model for large-scale word alignment experiments. We also propose an approximate EM algorithm and a Gibbs sampling algorithm to estimate model parameters in an unsupervised manner. Experiments on large-scale Chinese-English translation tasks demonstrate that our model achieves improvements in both alignment quality and translation quality.

Integrating Translation Memory into Phrase-Based Machine Translation during Decoding
Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Handling Ambiguities of Bilingual Predicate-Argument Structures for Statistical Machine Translation
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
Jiajun Zhang | Chengqing Zong
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

A Novel Translation Framework Based on Rhetorical Structure Theory
Mei Tu | Yu Zhou | Chengqing Zong
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Dual Training and Dual Prediction for Polarity Classification
Rui Xia | Tao Wang | Xuelei Hu | Shoushan Li | Chengqing Zong
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing
Zhiguo Wang | Chengqing Zong | Nianwen Xue
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Joint Model to Identify and Align Bilingual Named Entities
Yufeng Chen | Chengqing Zong | Keh-Yih Su
Computational Linguistics, Volume 39, Issue 2 - June 2013

A Study of the Effectiveness of Suffixes for Chinese Word Segmentation
Xiaoqing Li | Chengqing Zong | Keh-Yih Su
Proceedings of the 27th Pacific Asia Conference on Language, Information, and Computation (PACLIC 27)

2012

Integrating Surface and Abstract Features for Robust Cross-Domain Chinese Word Segmentation
Xiaoqing Li | Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of COLING 2012

Machine Translation by Modeling Predicate-Argument Structure Transformation
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of COLING 2012

Tree-based Translation without using Parse Trees
Feifei Zhai | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of COLING 2012

2011

Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
Jiajun Zhang | Feifei Zhai | Chengqing Zong
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

A Semantic-Specific Model for Chinese Named Entity Translation
Yufeng Chen | Chengqing Zong
Proceedings of 5th International Joint Conference on Natural Language Processing

A POS-based Ensemble Model for Cross-domain Sentiment Classification
Rui Xia | Chengqing Zong
Proceedings of 5th International Joint Conference on Natural Language Processing

Parse Reranking Based on Higher-Order Lexical Dependencies
Zhiguo Wang | Chengqing Zong
Proceedings of 5th International Joint Conference on Natural Language Processing

Automatic Evaluation of Chinese Translation Output: Word-Level or Character-Level?
Maoxi Li | Chengqing Zong | Hwee Tou Ng
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

A Character-Based Joint Model for CIPS-SIGHAN Word Segmentation Bakeoff 2010
Kun Wang | Chengqing Zong | Keh-Yih Su
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Treebank Conversion based Self-training Strategy for Parsing
Zhiguo Wang | Chengqing Zong
CIPS-SIGHAN Joint Conference on Chinese Language Processing

A Novel Reordering Model Based on Multi-layer Phrase for Statistical Machine Translation
Yanqing He | Yu Zhou | Chengqing Zong | Huilin Wang
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

A Character-Based Joint Model for Chinese Word Segmentation
Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

A Minimum Error Weighting Combination Strategy for Chinese Semantic Role Labeling
Tao Zhuang | Chengqing Zong
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Phrase Structure Parsing with Dependency Structure
Zhiguo Wang | Chengqing Zong
Coling 2010: Posters

Exploring the Use of Word Relation Features for Sentiment Classification
Rui Xia | Chengqing Zong
Coling 2010: Posters

On Jointly Recognizing and Aligning Bilingual Named Entities
Yufeng Chen | Chengqing Zong | Keh-Yih Su
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

CASIA-CASSIL: a Chinese Telephone Conversation Corpus in Real Scenarios with Multi-leveled Annotation
Keyan Zhou | Aijun Li | Zhigang Yin | Chengqing Zong
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

CASIA-CASSIL is a large-scale corpus base of Chinese human-human naturally-occurring telephone conversations in restricted domains. The first edition consists of 792 90-second conversations belonging to the tourism domain, which are selected from 7,639 spontaneous telephone recordings in real scenarios. The corpus is now being annotated with a wide range of linguistic and paralinguistic information at multiple levels. The annotations include Turns, Speaker Gender, Orthographic Transcription, Chinese Syllable, Chinese Phonetic Transcription, Prosodic Boundary, Stress of Sentence, Non-Speech Sounds, Voice Quality, Topic, Dialog-act and Adjacency Pairs, Ill-formedness, and Expressive Emotion, 13 levels in total. The abundant annotation will be especially useful for studying Chinese spoken language phenomena. This paper describes the whole process of building the conversation corpus, including collecting and selecting the original data, and the follow-up steps such as transcribing and annotating. CASIA-CASSIL is being extended to a large-scale corpus base of annotated Chinese dialogs for spoken Chinese study.

Joint Inference for Bilingual Semantic Role Labeling
Tao Zhuang | Chengqing Zong
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

Layer-Based Dependency Parsing
Ping Jian | Chengqing Zong
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 1

Approach to Selecting Best Development Set for Phrase-Based Statistical Machine Translation
Peng Liu | Yu Zhou | Chengqing Zong
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 1

A Framework for Effectively Integrating Hard and Soft Syntactic Rules into Phrase Based Translation
Jiajun Zhang | Chengqing Zong
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

Which is More Suitable for Chinese Word Segmentation, the Generative Model or the Discriminative One?
Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

A Framework of Feature Selection Methods for Text Categorization
Shoushan Li | Rui Xia | Chengqing Zong | Chu-Ren Huang
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Tutorial Abstracts of ACL-IJCNLP 2009
Diana McCarthy | Chengqing Zong
Tutorial Abstracts of ACL-IJCNLP 2009

2008

Multi-domain Sentiment Classification
Shoushan Li | Chengqing Zong
Proceedings of ACL-08: HLT, Short Papers

A New Approach to Automatic Document Summarization
Xiaofeng Wu | Chengqing Zong
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Domain Adaptation for Statistical Machine Translation with Domain Dictionary and Monolingual Corpora
Hua Wu | Haifeng Wang | Chengqing Zong
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Sentence Type Based Reordering Model for Statistical Machine Translation
Jiajun Zhang | Chengqing Zong | Shoushan Li
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2006

A Hybrid Approach to Chinese Base Noun Phrase Chunking
Fang Xu | Chengqing Zong | Jun Zhao
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing

2005

A Hierarchical Parsing Approach with Punctuation Processing for Long Chinese Sentences
Xing Li | Chengqing Zong | Rile Hu
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

2004

Collecting and Sharing Bilingual Spontaneous Speech Corpora: the ChinFaDial Experiment
Georges Fafiotte | Christian Boitet | Mark Seligman | Chengqing Zong
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Utterance Segmentation Using Combined Approach Based on Bi-directional N-gram and Maximum Entropy
Ding Liu | Chengqing Zong
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing

2002

Interactive Chinese-to-English Speech Translation Based on Dialogue Management
Chengqing Zong | Bo Xu | Taiyi Huang
Proceedings of the ACL-02 Workshop on Speech-to-Speech Translation: Algorithms and Systems

Chinese Syntactic Parsing Based on Extended GLR Parsing Algorithm with PCFG*
Yan Zhang | Bo Xu | Chengqing Zong
COLING 2002: The 17th International Conference on Computational Linguistics: Project Notes

2000

Chinese Generation in a Spoken Dialogue Translation System
Hua Wu | Taiyi Huang | Chengqing Zong
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics