Haizhou Li


2020

Modeling Code-Switch Languages Using Bilingual Parallel Corpus
Grandee Lee | Haizhou Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Language modeling is the technique of estimating the probability of a sequence of words. A bilingual language model is expected to model the sequential dependency of words across languages, which is difficult due to the inherent lack of suitable training data as well as the diverse syntactic structures across languages. We propose a bilingual attention language model (BALM) that simultaneously optimizes a language modeling objective and a quasi-translation objective to model both the monolingual and the cross-lingual sequential dependency. The attention mechanism learns the bilingual context from a parallel corpus. BALM achieves state-of-the-art performance on the SEAME code-switch database, reducing perplexity by 20.5% over the best-reported result. We also apply BALM to bilingual lexicon induction and language normalization tasks to validate the idea.

2018

Proceedings of the Seventh Named Entities Workshop
Nancy Chen | Rafael E. Banchs | Xiangyu Duan | Min Zhang | Haizhou Li
Proceedings of the Seventh Named Entities Workshop

Named-Entity Tagging and Domain adaptation for Better Customized Translation
Zhongwei Li | Xuancong Wang | Ai Ti Aw | Eng Siong Chng | Haizhou Li
Proceedings of the Seventh Named Entities Workshop

Customized translation needs to pay special attention to target-domain terminology, especially the named entities of the domain. Adding linguistic features to neural machine translation (NMT) has been shown to benefit translation in many studies. In this paper, we further demonstrate that adding named-entity (NE) features, obtained with named-entity recognition (NER), to the source language produces better translation with NMT. Our experiments show that by just including the different NE classes and boundary tags, we can increase the BLEU score by around 1 to 2 points using the standard test sets from WMT2017. We also show that adding NE tags using NER and applying in-domain adaptation can be combined to further improve customized machine translation.

NEWS 2018 Whitepaper
Nancy Chen | Xiangyu Duan | Min Zhang | Rafael E. Banchs | Haizhou Li
Proceedings of the Seventh Named Entities Workshop

Transliteration is defined as the phonetic translation of names across languages. Transliteration of Named Entities (NEs) is necessary in many applications, such as machine translation, corpus alignment, cross-language IR, information extraction and automatic lexicon acquisition. All such systems call for high-performance transliteration, which is the focus of the shared task in the NEWS 2018 workshop. The objective of the shared task is to promote machine transliteration research by providing a common benchmarking platform for the community to evaluate state-of-the-art technologies.

Report of NEWS 2018 Named Entity Transliteration Shared Task
Nancy Chen | Rafael E. Banchs | Min Zhang | Xiangyu Duan | Haizhou Li
Proceedings of the Seventh Named Entities Workshop

This report presents the results from the Named Entity Transliteration Shared Task conducted as part of The Seventh Named Entities Workshop (NEWS 2018) held at ACL 2018 in Melbourne, Australia. Similar to previous editions of NEWS, the Shared Task featured 19 tasks on proper name transliteration, including 13 different languages and two different Japanese scripts. A total of 6 teams from 8 different institutions participated in the evaluation, submitting 424 runs involving different transliteration methodologies. Four performance metrics were used to report the evaluation results. The NEWS shared task on machine transliteration has successfully achieved its objectives by providing a common ground for the research community to conduct comparative evaluations of state-of-the-art technologies that will benefit future research and development in this area.

2016

Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling for Dialogue Topic Tracking
Seokhwan Kim | Rafael Banchs | Haizhou Li
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Proceedings of the Sixth Named Entity Workshop
Xiangyu Duan | Rafael E. Banchs | Min Zhang | Haizhou Li | A Kumaran
Proceedings of the Sixth Named Entity Workshop

Evaluating and Combining Name Entity Recognition Systems
Ridong Jiang | Rafael E. Banchs | Haizhou Li
Proceedings of the Sixth Named Entity Workshop

Whitepaper of NEWS 2016 Shared Task on Machine Transliteration
Xiangyu Duan | Min Zhang | Haizhou Li | Rafael Banchs | A Kumaran
Proceedings of the Sixth Named Entity Workshop

Report of NEWS 2016 Machine Transliteration Shared Task
Xiangyu Duan | Rafael Banchs | Min Zhang | Haizhou Li | A. Kumaran
Proceedings of the Sixth Named Entity Workshop

2015

Wikification of Concept Mentions within Spoken Dialogues Using Domain Constraints from Wikipedia
Seokhwan Kim | Rafael E. Banchs | Haizhou Li
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Proceedings of the Fifth Named Entity Workshop
Xiangyu Duan | Rafael E. Banchs | Min Zhang | Haizhou Li | A Kumaran
Proceedings of the Fifth Named Entity Workshop

Whitepaper of NEWS 2015 Shared Task on Machine Transliteration
Min Zhang | Haizhou Li | Rafael E. Banchs | A Kumaran
Proceedings of the Fifth Named Entity Workshop

Report of NEWS 2015 Machine Transliteration Shared Task
Rafael E. Banchs | Min Zhang | Xiangyu Duan | Haizhou Li | A. Kumaran
Proceedings of the Fifth Named Entity Workshop

Towards Improving Dialogue Topic Tracking Performances with Wikification of Concept Mentions
Seokhwan Kim | Rafael E. Banchs | Haizhou Li
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain Knowledge from Wikipedia
Seokhwan Kim | Rafael E. Banchs | Haizhou Li
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Meaning Unit Segmentation in English and Chinese: a New Approach to Discourse Phenomena
Jennifer Williams | Rafael Banchs | Haizhou Li
Proceedings of the Workshop on Discourse in Machine Translation

Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions
Xiaoming Lu | Lei Xie | Cheung-Chi Leung | Bin Ma | Haizhou Li
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Modeling of term-distance and term-occurrence information for improving n-gram language model performance
Tze Yuang Chong | Rafael E. Banchs | Eng Siong Chng | Haizhou Li
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Haizhou Li | Chin-Yew Lin | Miles Osborne | Gary Geunbae Lee | Jong C. Park
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Utilizing Dependency Language Models for Graph-based Dependency Parsing Models
Wenliang Chen | Min Zhang | Haizhou Li
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Modeling the Translation of Predicate-Argument Structure for SMT
Deyi Xiong | Min Zhang | Haizhou Li
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Haizhou Li | Chin-Yew Lin | Miles Osborne | Gary Geunbae Lee | Jong C. Park
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

IRIS: a Chat-oriented Dialogue System based on the Vector Space Model
Rafael E. Banchs | Haizhou Li
Proceedings of the ACL 2012 System Demonstrations

Proceedings of the 4th Named Entity Workshop (NEWS) 2012
Min Zhang | Haizhou Li | A Kumaran
Proceedings of the 4th Named Entity Workshop (NEWS) 2012

Whitepaper of NEWS 2012 Shared Task on Machine Transliteration
Min Zhang | Haizhou Li | A Kumaran | Ming Liu
Proceedings of the 4th Named Entity Workshop (NEWS) 2012

Report of NEWS 2012 Machine Transliteration Shared Task
Min Zhang | Haizhou Li | A Kumaran | Ming Liu
Proceedings of the 4th Named Entity Workshop (NEWS) 2012

2011

SMT Helps Bitext Dependency Parsing
Wenliang Chen | Jun’ichi Kazama | Min Zhang | Yoshimasa Tsuruoka | Yujie Zhang | Yiou Wang | Kentaro Torisawa | Haizhou Li
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Joint Models for Chinese POS Tagging and Dependency Parsing
Zhenghua Li | Min Zhang | Wanxiang Che | Ting Liu | Wenliang Chen | Haizhou Li
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Proceedings of the 3rd Named Entities Workshop (NEWS 2011)
Min Zhang | Haizhou Li | A Kumaran
Proceedings of the 3rd Named Entities Workshop (NEWS 2011)

Report of NEWS 2011 Machine Transliteration Shared Task
Min Zhang | Haizhou Li | A Kumaran | Ming Liu
Proceedings of the 3rd Named Entities Workshop (NEWS 2011)

Whitepaper of NEWS 2011 Shared Task on Machine Transliteration
Min Zhang | A Kumaran | Haizhou Li
Proceedings of the 3rd Named Entities Workshop (NEWS 2011)

CLGVSM: Adapting Generalized Vector Space Model to Cross-lingual Document Clustering
Guoyu Tang | Yunqing Xia | Min Zhang | Haizhou Li | Fang Zheng
Proceedings of 5th International Joint Conference on Natural Language Processing

Joint Alignment and Artificial Data Generation: An Empirical Study of Pivot-based Machine Transliteration
Min Zhang | Xiangyu Duan | Ming Liu | Yunqing Xia | Haizhou Li
Proceedings of 5th International Joint Conference on Natural Language Processing

Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers
Deyi Xiong | Min Zhang | Haizhou Li
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

AM-FM: A Semantic Framework for Translation Quality Assessment
Rafael E. Banchs | Haizhou Li
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Learning Translation Boundaries for Phrase-Based Decoding
Deyi Xiong | Min Zhang | Haizhou Li
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Linguistically Annotated Reordering: Evaluation and Analysis
Deyi Xiong | Min Zhang | Aiti Aw | Haizhou Li
Computational Linguistics, Volume 36, Issue 3 - September 2010

Proceedings of the 2010 Named Entities Workshop
A Kumaran | Haizhou Li
Proceedings of the 2010 Named Entities Workshop

Report of NEWS 2010 Transliteration Generation Shared Task
Haizhou Li | A Kumaran | Min Zhang | Vladimir Pervouchine
Proceedings of the 2010 Named Entities Workshop

Whitepaper of NEWS 2010 Shared Task on Transliteration Generation
Haizhou Li | A Kumaran | Min Zhang | Vladimir Pervouchine
Proceedings of the 2010 Named Entities Workshop

Report of NEWS 2010 Transliteration Mining Shared Task
A Kumaran | Mitesh M. Khapra | Haizhou Li
Proceedings of the 2010 Named Entities Workshop

Whitepaper of NEWS 2010 Shared Task on Transliteration Mining
A Kumaran | Mitesh M. Khapra | Haizhou Li
Proceedings of the 2010 Named Entities Workshop

EM-based Hybrid Model for Bilingual Terminology Extraction from Comparable Corpora
Lianhau Lee | Aiti Aw | Min Zhang | Haizhou Li
Coling 2010: Posters

Improving Name Origin Recognition with Context Features and Unlabelled Data
Vladimir Pervouchine | Min Zhang | Ming Liu | Haizhou Li
Coling 2010: Posters

Machine Transliteration: Leveraging on Third Languages
Min Zhang | Xiangyu Duan | Vladimir Pervouchine | Haizhou Li
Coling 2010: Posters

Pseudo-Word for Phrase-Based Machine Translation
Xiangyu Duan | Min Zhang | Haizhou Li
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Error Detection for Statistical Machine Translation Using Linguistic Features
Deyi Xiong | Min Zhang | Haizhou Li
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Convolution Kernel over Packed Parse Forest
Min Zhang | Hui Zhang | Haizhou Li
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Non-Isomorphic Forest Pair Translation
Hui Zhang | Min Zhang | Haizhou Li | Eng Siong Chng
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

Tree Kernel-based SVM with Structured Syntactic Knowledge for BTG-based Phrase Reordering
Min Zhang | Haizhou Li
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Fast Translation Rule Matching for Syntax-based Statistical Machine Translation
Hui Zhang | Min Zhang | Haizhou Li | Chew Lim Tan
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

K-Best Combination of Syntactic Parsers
Hui Zhang | Min Zhang | Chew Lim Tan | Haizhou Li
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)
Haizhou Li | A Kumaran
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

Report of NEWS 2009 Machine Transliteration Shared Task
Haizhou Li | A Kumaran | Vladimir Pervouchine | Min Zhang
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

Whitepaper of NEWS 2009 Machine Transliteration Shared Task
Haizhou Li | A Kumaran | Min Zhang | Vladimir Pervouchine
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
Keh-Yih Su | Jian Su | Janyce Wiebe | Haizhou Li
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Transliteration Alignment
Vladimir Pervouchine | Haizhou Li | Bo Lin
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Forest-based Tree Sequence to String Translation Model
Hui Zhang | Min Zhang | Haizhou Li | Aiti Aw | Chew Lim Tan
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

A Syntax-Driven Bracketing Model for Phrase-Based Translation
Deyi Xiong | Min Zhang | Aiti Aw | Haizhou Li
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Topological Ordering of Function Words in Hierarchical Phrase-based Translation
Hendra Setiawan | Min-Yen Kan | Haizhou Li | Philip Resnik
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

A Comparative Study of Hypothesis Alignment and its Improvement for Machine Translation System Combination
Boxing Chen | Min Zhang | Haizhou Li | Aiti Aw
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Keh-Yih Su | Jian Su | Janyce Wiebe | Haizhou Li
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

MARS: Multilingual Access and Retrieval System with Enhanced Query Translation and Document Retrieval
Lianhau Lee | Aiti Aw | Thuy Vu | Sharifah Aljunied Mahani | Min Zhang | Haizhou Li
Proceedings of the ACL-IJCNLP 2009 Software Demonstrations

2008

A Tree Sequence Alignment-based Tree-to-Tree Translation Model
Min Zhang | Hongfei Jiang | Aiti Aw | Haizhou Li | Chew Lim Tan | Sheng Li
Proceedings of ACL-08: HLT

A Linguistically Annotated Reordering Model for BTG-based Statistical Machine Translation
Deyi Xiong | Min Zhang | Aiti Aw | Haizhou Li
Proceedings of ACL-08: HLT, Short Papers

Exploiting N-best Hypotheses for SMT Self-Enhancement
Boxing Chen | Min Zhang | Aiti Aw | Haizhou Li
Proceedings of ACL-08: HLT, Short Papers

Name Origin Recognition Using Maximum Entropy Model and Diverse Features
Min Zhang | Chengjie Sun | Haizhou Li | AiTi Aw | Chew Lim Tan | Xiaolong Wang
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Multi-View Co-Training of Transliteration Model
Jin-Shea Kuo | Haizhou Li
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Mining Transliterations from Web Query Results: An Incremental Approach
Jin-Shea Kuo | Haizhou Li | Chih-Lung Lin
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

NIST 2007 Language Recognition Evaluation: From the Perspective of IIR
Haizhou Li | Bin Ma | Kong-Aik Lee | Khe-Chai Sim | Hanwu Sun | Rong Tong | Donglai Zhu | Changhuai You
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation

Regenerating Hypotheses for Statistical Machine Translation
Boxing Chen | Min Zhang | Aiti Aw | Haizhou Li
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Linguistically Annotated BTG for Statistical Machine Translation
Deyi Xiong | Min Zhang | Aiti Aw | Haizhou Li
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Grammar Comparison Study for Translational Equivalence Modeling and Statistical Machine Translation
Min Zhang | Hongfei Jiang | Haizhou Li | Aiti Aw | Sheng Li
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

Semantic Transliteration of Personal Names
Haizhou Li | Khe Chai Sim | Jin-Shea Kuo | Minghui Dong
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

Ordering Phrases with Function Words
Hendra Setiawan | Min-Yen Kan | Haizhou Li
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

A Statistical Language Modeling Approach to Lattice-Based Spoken Document Retrieval
Tee Kiah Chia | Haizhou Li | Hwee Tou Ng
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

Learning Transliteration Lexicons from the Web
Jin-Shea Kuo | Haizhou Li | Ying-Kuei Yang
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

A Comparative Study of Four Language Identification Systems
Bin Ma | Haizhou Li
International Journal of Computational Linguistics & Chinese Language Processing, Volume 11, Number 2, June 2006

2005

Phrase-Based Statistical Machine Translation: A Level of Detail Approach
Hendra Setiawan | Haizhou Li | Min Zhang | Beng Chin Ooi
Second International Joint Conference on Natural Language Processing: Full Papers

A Phrase-Based Context-Dependent Joint Probability Model for Named Entity Translation
Min Zhang | Haizhou Li | Jian Su | Hendra Setiawan
Second International Joint Conference on Natural Language Processing: Full Papers

A Phonotactic Language Model for Spoken Language Identification
Haizhou Li | Bin Ma
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

A Joint Source-Channel Model for Machine Transliteration
Haizhou Li | Min Zhang | Jian Su
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

Direct Orthographical Mapping for Machine Transliteration
Min Zhang | Haizhou Li | Jian Su
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

1998

Chinese Word Segmentation
Haizhou Li | Baosheng Yuan
Proceedings of the 12th Pacific Asia Conference on Language, Information and Computation