Hailong Cao


2016

A Distribution-based Model to Learn Bilingual Word Embeddings
Hailong Cao | Tiejun Zhao | Shu Zhang | Yao Meng
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We introduce a distribution-based model to learn bilingual word embeddings from monolingual data. It is simple and effective, and it requires neither parallel data nor a seed lexicon. We take advantage of the fact that word embeddings are usually dense, real-valued, low-dimensional vectors, so their distribution can be accurately estimated. We propose a novel cross-lingual learning objective that directly matches the distribution of word embeddings in one language with that in the other language. During joint learning, we dynamically estimate the distributions of the word embeddings in the two languages and minimize the dissimilarity between them through the standard backpropagation algorithm. The learned bilingual word embeddings group each word and its translations together in the shared vector space. We demonstrate the utility of the learned embeddings on the task of finding word-to-word translations from monolingual corpora. Our model achieved encouraging performance on data from both related languages and substantially different languages.
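The abstract does not say how the two distributions are estimated or how their dissimilarity is measured. As a rough illustration of the general idea only, the sketch below assumes each distribution is summarized by its per-dimension mean and variance and that the dissimilarity is the squared gap between those moments. The names `moment_mismatch`, `grad_wrt_tgt`, and `align` are hypothetical, and the monolingual training objective that the joint model would also optimize is omitted.

```python
import numpy as np

def moment_mismatch(src, tgt):
    """Assumed dissimilarity: squared gaps in per-dimension mean and variance."""
    mean_gap = src.mean(axis=0) - tgt.mean(axis=0)
    var_gap = src.var(axis=0) - tgt.var(axis=0)
    return float(np.sum(mean_gap ** 2) + np.sum(var_gap ** 2))

def grad_wrt_tgt(src, tgt):
    """Analytic gradient of moment_mismatch with respect to the target embeddings."""
    n = tgt.shape[0]
    mu_t = tgt.mean(axis=0)
    mean_gap = src.mean(axis=0) - mu_t
    var_gap = src.var(axis=0) - tgt.var(axis=0)
    # d(mean)/d(t_i) = 1/n and d(var)/d(t_i) = 2 * (t_i - mean) / n
    grad_mean = -2.0 * mean_gap / n                    # identical for every row
    grad_var = -4.0 * var_gap * (tgt - mu_t) / n       # depends on each row
    return grad_mean + grad_var

def align(src, tgt, steps=50, rate=0.05):
    """Gradient descent on the target embeddings; the source side is held fixed here."""
    tgt = tgt.copy()
    step_size = rate * tgt.shape[0]  # undo the 1/n scaling of the gradient
    for _ in range(steps):
        tgt -= step_size * grad_wrt_tgt(src, tgt)
    return tgt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(0.0, 1.0, size=(500, 4))   # stand-in for language A embeddings
    tgt = rng.normal(1.0, 1.5, size=(400, 4))   # stand-in for language B embeddings
    print("before:", moment_mismatch(src, tgt))
    print("after: ", moment_mismatch(src, align(src, tgt)))
```

Matching moments in this way only brings the two embedding clouds into the same region of the space. On its own it does not pair a word with its translation, which is why the model in the paper learns this objective jointly with the embeddings themselves.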

2014

A Lexicalized Reordering Model for Hierarchical Phrase-based Translation
Hailong Cao | Dongdong Zhang | Mu Li | Ming Zhou | Tiejun Zhao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Soft Dependency Matching for Hierarchical Phrase-based Machine Translation
Hailong Cao | Dongdong Zhang | Ming Zhou | Tiejun Zhao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2012

Expected Error Minimization with Ultraconservative Update for SMT
Lemao Liu | Tiejun Zhao | Taro Watanabe | Hailong Cao | Conghui Zhu
Proceedings of COLING 2012: Posters

Locally Training the Log-Linear Model for SMT
Lemao Liu | Hailong Cao | Taro Watanabe | Tiejun Zhao | Mo Yu | Conghui Zhu
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2010

Syntactic Constraints on Phrase Extraction for Phrase-Based Machine Translation
Hailong Cao | Andrew Finch | Eiichiro Sumita
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation

Filtering Syntactic Constraints for Statistical Machine Translation
Hailong Cao | Eiichiro Sumita
Proceedings of the ACL 2010 Conference Short Papers