Jingbo Zhu


2020

Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation
Bei Li | Hui Liu | Ziyang Wang | Yufan Jiang | Tong Xiao | Jingbo Zhu | Tongran Liu | Changliang Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In encoder-decoder neural models, multiple encoders are in general used to represent the contextual information in addition to the individual sentence. In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). Surprisingly, we find that the context encoder does not only encode the surrounding sentences but also behaves as a noise generator. This makes us rethink the real benefits of multi-encoder in context-aware translation: some of the improvements come from robust training. We compare several methods that introduce noise and/or well-tuned dropout setup into the training of these encoders. Experimental results show that noisy training plays an important role in multi-encoder-based NMT, especially when the training data is small. Also, we establish a new state-of-the-art on IWSLT Fr-En task by careful use of noise generation and dropout methods.

Learning Architectures from an Extended Search Space for Language Modeling
Yinqiao Li | Chi Hu | Yuhao Zhang | Nuo Xu | Yufan Jiang | Tong Xiao | Jingbo Zhu | Tongran Liu | Changliang Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learn both intra-cell and inter-cell architectures (call it ESS). For a better search result, we design a joint learning method to perform intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it outperforms a strong baseline significantly on the PTB and WikiText data, with a new state-of-the-art on PTB. Moreover, the learned architectures show good transferability to other systems. E.g., they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.

The NiuTrans System for WNGT 2020 Efficiency Task
Chi Hu | Bei Li | Yinqiao Li | Ye Lin | Yanyang Li | Chenglong Wang | Tong Xiao | Jingbo Zhu
Proceedings of the Fourth Workshop on Neural Generation and Translation

This paper describes the submissions of the NiuTrans Team to the WNGT 2020 Efficiency Shared Task. We focus on the efficient implementation of deep Transformer models (Wang et al., 2019; Li et al., 2019) using NiuTensor, a flexible toolkit for NLP tasks. We explored the combination of deep encoder and shallow decoder in Transformer models via model compression and knowledge distillation. The neural machine translation decoding also benefits from FP16 inference, attention caching, dynamic batching, and batch pruning. Our systems achieve promising results in both translation quality and efficiency, e.g., our fastest system can translate more than 40,000 tokens per second with an RTX 2080 Ti while maintaining 42.9 BLEU on newstest2018.

Training Flexible Depth Model by Multi-Task Learning for Neural Machine Translation
Qiang Wang | Tong Xiao | Jingbo Zhu
Findings of the Association for Computational Linguistics: EMNLP 2020

The standard neural machine translation model can only decode with the same depth configuration as training. Restricted by this feature, we have to deploy models of various sizes to maintain the same translation latency, because the hardware conditions on different terminal devices (e.g., mobile phones) may vary greatly. Such individual training leads to increased model maintenance costs and slower model iterations, especially for the industry. In this work, we propose to use multi-task learning to train a flexible depth model that can adapt to different depth configurations during inference. Experimental results show that our approach can simultaneously support decoding in 24 depth configurations and is superior to the individual training and another flexible depth model training method, LayerDrop.

Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
Chen Xu | Bojie Hu | Yufan Jiang | Kai Feng | Zeyang Wang | Shen Huang | Qi Ju | Tong Xiao | Jingbo Zhu
Proceedings of the 28th International Conference on Computational Linguistics

Large amounts of data have made neural machine translation (NMT) a big success in recent years. But it is still a challenge if we train these models on small-scale corpora. In this case, the way of using data appears to be more important. Here, we investigate the effective use of training data for low-resource NMT. In particular, we propose a dynamic curriculum learning (DCL) method to reorder training samples in training. Unlike previous work, we do not use a static scoring function for reordering. Instead, the order of training samples is dynamically determined in two ways: loss decline and model competence. This eases training by highlighting easy samples that the current model has enough competence to learn. We test our DCL method in a Transformer-based system. Experimental results show that DCL outperforms several strong baselines on three low-resource machine translation benchmarks and different-sized data of WMT’16 En-De.

Layer-Wise Multi-View Learning for Neural Machine Translation
Qiang Wang | Changliang Li | Yue Zhang | Tong Xiao | Jingbo Zhu
Proceedings of the 28th International Conference on Computational Linguistics

Traditional neural machine translation is limited to the topmost encoder layer’s context representation and cannot directly perceive the lower encoder layers. Existing solutions usually rely on the adjustment of network architecture, making the calculation more complicated or introducing additional structural restrictions. In this work, we propose layer-wise multi-view learning to solve this problem, circumventing the necessity to change the model structure. We regard each encoder layer’s off-the-shelf output, a by-product in layer-by-layer encoding, as the redundant view for the input sentence. In this way, in addition to the topmost encoder layer (referred to as the primary view), we also incorporate an intermediate encoder layer as the auxiliary view. We feed the two views to a partially shared decoder to maintain independent prediction. Consistency regularization based on KL divergence is used to encourage the two views to learn from each other. Extensive experimental results on five translation tasks show that our approach yields stable improvements over multiple strong baselines. As another bonus, our method is agnostic to network architectures and can maintain the same inference speed as the original model.
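
As a rough illustration of the consistency regularization mentioned in this abstract, the sketch below computes a KL-based penalty between the output distributions predicted from two views. It is a minimal sketch, not the authors' implementation: the function names, the symmetric averaging of the two KL directions, and the `eps` smoothing constant are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    eps is a hypothetical smoothing constant that avoids log(0).
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def consistency_loss(primary_logits, auxiliary_logits):
    """Symmetric KL between the primary-view and auxiliary-view predictions.

    Averaging both directions is an assumption of this sketch; the abstract
    only states that the regularizer is based on KL divergence.
    """
    p = softmax(primary_logits)
    q = softmax(auxiliary_logits)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
```

In training, a term like this would typically be added to the usual translation loss for each view, so that the two predictions are pulled toward each other; the weighting between the terms is not specified here.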

A Simple and Effective Approach to Robust Unsupervised Bilingual Dictionary Induction
Yanyang Li | Yingfeng Luo | Ye Lin | Quan Du | Huizhen Wang | Shujian Huang | Tong Xiao | Jingbo Zhu
Proceedings of the 28th International Conference on Computational Linguistics

Unsupervised Bilingual Dictionary Induction methods based on the initialization and the self-learning have achieved great success in similar language pairs, e.g., English-Spanish. But they still fail and have an accuracy of 0% in many distant language pairs, e.g., English-Japanese. In this work, we show that this failure results from the gap between the actual initialization performance and the minimum initialization performance for the self-learning to succeed. We propose Iterative Dimension Reduction to bridge this gap. Our experiments show that this simple method does not hamper the performance of similar language pairs and achieves an accuracy of 13.64% to 55.53% between English and four distant languages, i.e., Chinese, Japanese, Vietnamese and Thai.

Shallow-to-Deep Training for Neural Machine Translation
Bei Li | Ziyang Wang | Hui Liu | Yufan Jiang | Quan Du | Tong Xiao | Huizhen Wang | Jingbo Zhu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep encoders have been proven to be effective in improving neural machine translation (NMT) systems, but training an extremely deep encoder is time consuming. Moreover, why deep models help NMT is an open question. In this paper, we investigate the behavior of a well-tuned deep Transformer system. We find that stacking layers is helpful in improving the representation ability of NMT models and adjacent layers perform similarly. This inspires us to develop a shallow-to-deep training method that learns deep models by stacking shallow models. In this way, we successfully train a Transformer system with a 54-layer encoder. Experimental results on WMT’16 English-German and WMT’14 English-French translation tasks show that it is 1.4× faster than training from scratch, and achieves a BLEU score of 30.33 and 43.29 on two tasks. The code is publicly available at https://github.com/libeineu/SDT-Training.

2019

Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition
Yufan Jiang | Chi Hu | Tong Xiao | Chunliang Zhang | Jingbo Zhu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing. In particular, we improve differentiable architecture search by removing the softmax-local constraint. Also, we apply differentiable NAS to named entity recognition (NER). It is the first time that differentiable NAS methods are adopted in NLP tasks other than language modeling. On both the PTB language modeling and CoNLL-2003 English NER data, our method outperforms strong baselines. It achieves a new state-of-the-art on the NER task.

Learning Deep Transformer Models for Machine Translation
Qiang Wang | Bei Li | Tong Xiao | Jingbo Zhu | Changliang Li | Derek F. Wong | Lidia S. Chao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Transformer is the state-of-the-art model in recent machine translation evaluations. Two strands of research are promising to improve models of this kind: the first uses wide networks (a.k.a. Transformer-Big) and has been the de facto standard for development of the Transformer system, and the other uses deeper language representation but faces the difficulty arising from learning deep networks. Here, we continue the line of research on the latter. We claim that a truly deep Transformer model can surpass the Transformer-Big counterpart by 1) proper use of layer normalization and 2) a novel way of passing the combination of previous layers to the next. On WMT’16 English-German and NIST OpenMT’12 Chinese-English tasks, our deep system (30/25-layer encoder) outperforms the shallow Transformer-Big/Base baseline (6-layer encoder) by 0.4-2.4 BLEU points. As another bonus, the deep model is 1.6X smaller in size and 3X faster in training than Transformer-Big.

Shared-Private Bilingual Word Embeddings for Neural Machine Translation
Xuebo Liu | Derek F. Wong | Yang Liu | Lidia S. Chao | Tong Xiao | Jingbo Zhu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Word embedding is central to neural machine translation (NMT), which has attracted intensive research interest in recent years. In NMT, the source embedding plays the role of the entrance while the target embedding acts as the terminal. These layers occupy most of the model parameters for representation learning. Furthermore, they indirectly interface via a soft-attention mechanism, which makes them comparatively isolated. In this paper, we propose shared-private bilingual word embeddings, which give a closer relationship between the source and target embeddings, and which also reduce the number of model parameters. For similar source and target words, their embeddings tend to share a part of the features and they cooperatively learn these common representation units. Experiments on 5 language pairs belonging to 6 different language families and written in 5 different alphabets demonstrate that the proposed model provides a significant performance boost over the strong baselines with dramatically fewer model parameters.

The NiuTrans Machine Translation Systems for WMT19
Bei Li | Yinqiao Li | Chen Xu | Ye Lin | Jiqiang Liu | Hui Liu | Ziyang Wang | Yuhao Zhang | Nuo Xu | Zeyang Wang | Kai Feng | Hexuan Chen | Tengbo Liu | Yanyang Li | Qiang Wang | Tong Xiao | Jingbo Zhu
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper describes the NiuTrans neural machine translation systems for the WMT 2019 news translation tasks. We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT}, GU→EN and the unsupervised DE↔CS sub-track. Our systems were built on Deep Transformer and several back-translation methods. Iterative knowledge distillation and ensemble+reranking were also employed to obtain stronger models. Our unsupervised submissions were based on NMT enhanced by SMT. As a result, we achieved the highest BLEU scores in {KK↔EN, GU→EN} directions, ranking 2nd in {RU→EN, DE↔CS} and 3rd in {ZH→EN, LT→EN, EN→RU, EN↔DE} among all constrained submissions.

2018

Multi-layer Representation Fusion for Neural Machine Translation
Qiang Wang | Fuxue Li | Tong Xiao | Yanyang Li | Yinqiao Li | Jingbo Zhu
Proceedings of the 27th International Conference on Computational Linguistics

Neural machine translation systems require a number of stacked layers for deep models. But the prediction depends on the sentence representation of the top-most layer with no access to low-level representations. This makes it more difficult to train the model and poses a risk of information loss to prediction. In this paper, we propose a multi-layer representation fusion (MLRF) approach to fusing stacked layers. In particular, we design three fusion functions to learn a better representation from the stack. Experimental results show that our approach yields improvements of 0.92 and 0.56 BLEU points over the strong Transformer baseline on IWSLT German-English and NIST Chinese-English MT tasks respectively. The result is a new state of the art in German-English translation.

A Simple and Effective Approach to Coverage-Aware Neural Machine Translation
Yanyang Li | Tong Xiao | Yinqiao Li | Qiang Wang | Changming Xu | Jingbo Zhu
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We offer a simple and effective method to seek a better balance between model confidence and length preference for Neural Machine Translation (NMT). Unlike the popular length normalization and coverage models, our model does not require training nor reranking the limited n-best outputs. Moreover, it is robust to large beam sizes, which is not well studied in previous work. On the Chinese-English and English-German translation tasks, our approach yields +0.4 to 1.5 BLEU improvements over the state-of-the-art baselines.

The NiuTrans Machine Translation System for WMT18
Qiang Wang | Bei Li | Jiqiang Liu | Bojian Jiang | Zheyang Zhang | Yinqiao Li | Ye Lin | Tong Xiao | Jingbo Zhu
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper describes the submission of the NiuTrans neural machine translation system for the WMT 2018 Chinese ↔ English news translation tasks. Our baseline systems are based on the Transformer architecture. We further improve the translation performance by 2.4-2.6 BLEU points from four aspects, including architectural improvements, diverse ensemble decoding, reranking, and post-processing. Among constrained submissions, we rank 2nd out of 16 submitted systems on Chinese → English task and 3rd out of 16 on English → Chinese task, respectively.

Detecting Free Translation in Parallel Corpora from Attention Scores
Qi Chen | Oi Yee Kwong | Jingbo Zhu
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

2017

Towards Bidirectional Hierarchical Representations for Attention-based Neural Machine Translation
Baosong Yang | Derek F. Wong | Tong Xiao | Lidia S. Chao | Jingbo Zhu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder. To maximize the predictive likelihood of target words, a weighted variant of an attention mechanism is used to balance the attentive information between lexical and phrase vectors. Using a tree-based rare word encoding, the proposed model is extended to sub-word level to alleviate the out-of-vocabulary (OOV) problem. Empirical results reveal that the proposed model significantly outperforms sequence-to-sequence attention-based and tree-based neural translation models in English-Chinese translation tasks.

2015

NiuParser: A Chinese Syntactic and Semantic Parsing Toolkit
Jingbo Zhu | Muhua Zhu | Qiang Wang | Tong Xiao
Proceedings of ACL-IJCNLP 2015 System Demonstrations

2014

Tagging The Web: Building A Robust Web Tagger with Neural Network
Ji Ma | Yue Zhang | Jingbo Zhu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

A Hybrid Approach to Skeleton-based Translation
Tong Xiao | Jingbo Zhu | Chunliang Zhang
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Punctuation Processing for Projective Dependency Parsing
Ji Ma | Yue Zhang | Jingbo Zhu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations
Kalina Bontcheva | Jingbo Zhu
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Effective Incorporation of Source Syntax into Hierarchical Phrase-based Translation
Tong Xiao | Adrià de Gispert | Jingbo Zhu | Bill Byrne
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Syntactic SMT Using a Discriminative Text Generation Model
Yue Zhang | Kai Song | Linfeng Song | Jingbo Zhu | Qun Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing
Liang-Chih Yu | Yuen-Hsien Tseng | Jingbo Zhu | Fuji Ren
Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing

Fast and Accurate Shift-Reduce Constituent Parsing
Muhua Zhu | Yue Zhang | Wenliang Chen | Min Zhang | Jingbo Zhu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Easy-First POS Tagging and Dependency Parsing with Beam Search
Ji Ma | Jingbo Zhu | Tong Xiao | Nan Yang
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Learning Better Rule Extraction with Translation Span Alignment
Jingbo Zhu | Tong Xiao | Chunliang Zhang
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
Tong Xiao | Jingbo Zhu | Hao Zhang | Qiang Li
Proceedings of the ACL 2012 System Demonstrations

NEU Systems in SIGHAN Bakeoff 2012
Ji Ma | LongFei Bai | Zhuo Liu | Ao Zhang | Jingbo Zhu
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

Easy-First Chinese POS Tagging and Dependency Parsing
Ji Ma | Tong Xiao | Jingbo Zhu | Feiliang Ren
Proceedings of COLING 2012

Exploiting Lexical Dependencies from Large-Scale Data for Better Shift-Reduce Constituency Parsing
Muhua Zhu | Jingbo Zhu | Huizhen Wang
Proceedings of COLING 2012

2011

Improving Decoding Generalization for Tree-to-String Translation
Jingbo Zhu | Tong Xiao
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Better Automatic Treebank Conversion Using A Feature-Based Approach
Muhua Zhu | Jingbo Zhu | Minghan Hu
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Mining Large-scale Parallel Corpora from Multilingual Patents: An English-Chinese example and its application to SMT
Bin Lu | Benjamin K. Tsou | Tao Jiang | Oi Yee Kwong | Jingbo Zhu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

High OOV-Recall Chinese Word Segmenter
Xiaoming Xu | Muhua Zhu | Xiaoxu Fei | Jingbo Zhu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Chinese Syntactic Parsing Evaluation
Qiang Zhou | Jingbo Zhu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

A Multi-stage Clustering Framework for Chinese Personal Name Disambiguation
Huizhen Wang | Haibo Ding | Yingchao Shi | Ji Ma | Xiao Zhou | Jingbo Zhu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

NEUNLPLab Chinese Word Sense Induction System for SIGHAN Bakeoff 2010
Hao Zhang | Tong Xiao | Jingbo Zhu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Heterogeneous Parsing via Collaborative Decoding
Muhua Zhu | Jingbo Zhu | Tong Xiao
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

An Empirical Study of Translation Rule Extraction with Multiple Parsers
Tong Xiao | Jingbo Zhu | Hao Zhang | Muhua Zhu
Coling 2010: Posters

Automatic Treebank Conversion via Informed Decoding
Muhua Zhu | Jingbo Zhu
Coling 2010: Posters

Boosting-Based System Combination for Machine Translation
Tong Xiao | Jingbo Zhu | Muhua Zhu | Huizhen Wang
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

Better Synchronous Binarization for Machine Translation
Tong Xiao | Mu Li | Dongdong Zhang | Jingbo Zhu | Ming Zhou
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Chinese-English Organization Name Translation Based on Correlative Expansion
Feiliang Ren | Muhua Zhu | Huizhen Wang | Jingbo Zhu
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

2008

Learning a Stopping Criterion for Active Learning for Word Sense Disambiguation and Text Classification
Jingbo Zhu | Huizhen Wang | Eduard Hovy
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Towards Automated Semantic Analysis on Biomedical Research Articles
Donghui Feng | Gully Burns | Jingbo Zhu | Eduard Hovy
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

An Effective Hybrid Machine Learning Approach for Coreference Resolution
Feiliang Ren | Jingbo Zhu
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

Which Performs Better on In-Vocabulary Word Segmentation: Based on Word or Character?
Zhenxing Wang | Changning Huang | Jingbo Zhu
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

The Character-based CRF Segmenter of MSRA&NEU for the 4th Bakeoff
Zhenxing Wang | Changning Huang | Jingbo Zhu
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

Multi-Criteria-Based Strategy to Stop Active Learning for Data Annotation
Jingbo Zhu | Huizhen Wang | Eduard Hovy
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Active Learning with Sampling by Uncertainty and Density for Word Sense Disambiguation and Text Classification
Jingbo Zhu | Huizhen Wang | Tianshun Yao | Benjamin K Tsou
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem
Jingbo Zhu | Eduard Hovy
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

Designing Special Post-Processing Rules for SVM-Based Chinese Word Segmentation
Muhua Zhu | Yilin Wang | Zhenxing Wang | Huizhen Wang | Jingbo Zhu
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing

2005

Using Multiple Discriminant Analysis Approach for Linear Text Segmentation
Jingbo Zhu | Na Ye | Xinzhi Chang | Wenliang Chen | Benjamin K Tsou
Second International Joint Conference on Natural Language Processing: Full Papers

Some Studies on Chinese Domain Knowledge Dictionary and Its Application to Text Classification
Jingbo Zhu | Wenliang Chen
Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing

2002

A Knowledge-based Approach to Text Classification
Jingbo Zhu | Tianshun Yao
COLING-02: The First SIGHAN Workshop on Chinese Language Processing