Jiang Guo


2019

pdf bib
Working Hard or Hardly Working: Challenges of Integrating Typology into Neural Dependency Parsers
Adam Fisch | Jiang Guo | Regina Barzilay
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper explores the task of leveraging typology in the context of cross-lingual dependency parsing. While this linguistic information has shown great promise in pre-neural parsing, results for neural architectures have been mixed. The aim of our investigation is to better understand this state-of-the-art. Our main findings are as follows: 1) The benefit of typological information is derived from coarsely grouping languages into syntactically-homogeneous clusters rather than from learning to leverage variations along individual typological dimensions in a compositional manner; 2) Typology consistent with the actual corpus statistics yields better transfer performance; 3) Typological similarity is only a rough proxy of cross-lingual transferability with respect to parsing.

pdf bib
Cross-Lingual BERT Transformation for Zero-Shot Dependency Parsing
Yuxuan Wang | Wanxiang Che | Jiang Guo | Yijia Liu | Ting Liu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper investigates the problem of learning cross-lingual representations in a contextual space. We propose Cross-Lingual BERT Transformation (CLBT), a simple and efficient approach to generate cross-lingual contextualized word embeddings based on publicly available pre-trained BERT models (Devlin et al., 2018). In this approach, a linear transformation is learned from contextual word alignments to align the contextualized embeddings independently trained in different languages. We demonstrate the effectiveness of this approach on zero-shot cross-lingual transfer parsing. Experiments show that our embeddings substantially outperform the previous state-of-the-art that uses static embeddings. We further compare our approach with XLM (Lample and Conneau, 2019), a recently proposed cross-lingual language model trained with massive parallel data, and achieve highly competitive results.

pdf bib
GraphIE: A Graph-Based Framework for Information Extraction
Yujie Qian | Enrico Santus | Zhijing Jin | Jiang Guo | Regina Barzilay
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Most modern Information Extraction (IE) systems are implemented as sequential taggers and only model local dependencies. Non-local and non-sequential context is, however, a valuable source of information to improve predictions. In this paper, we introduce GraphIE, a framework that operates over a graph representing a broad set of dependencies between textual units (i.e. words or sentences). The algorithm propagates information between connected nodes through graph convolutions, generating a richer representation that can be exploited to improve word-level predictions. Evaluation on three different tasks — namely textual, social media and visual information extraction — shows that GraphIE consistently outperforms the state-of-the-art sequence tagging model by a significant margin.

2018

pdf bib
Multi-Source Domain Adaptation with Mixture of Experts
Jiang Guo | Darsh Shah | Regina Barzilay
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We propose a mixture-of-experts approach for unsupervised domain adaptation from multiple sources. The key idea is to explicitly capture the relationship between a target example and different source domains. This relationship, expressed by a point-to-set metric, determines how to combine predictors trained on various domains. The metric is learned in an unsupervised fashion using meta-training. Experimental results on sentiment analysis and part-of-speech tagging demonstrate that our approach consistently outperforms multiple baselines and can robustly handle negative transfer.

2017

pdf bib
The HIT-SCIR System for End-to-End Parsing of Universal Dependencies
Wanxiang Che | Jiang Guo | Yuxuan Wang | Bo Zheng | Huaipeng Zhao | Yang Liu | Dechuan Teng | Ting Liu
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

This paper describes our system (HIT-SCIR) for the CoNLL 2017 shared task: Multilingual Parsing from Raw Text to Universal Dependencies. Our system includes three pipelined components: tokenization, Part-of-Speech (POS) tagging and dependency parsing. We use character-based bidirectional long short-term memory (LSTM) networks for both tokenization and POS tagging. Afterwards, we employ a list-based transition-based algorithm for general non-projective parsing and present an improved Stack-LSTM-based architecture for representing each transition state and making predictions. Furthermore, to parse low/zero-resource languages and cross-domain data, we use a model transfer approach to make effective use of existing resources. We demonstrate substantial gains against the UDPipe baseline, with an average improvement of 3.76% in LAS of all languages. And finally, we rank the 4th place on the official test sets.

2016

pdf bib
A Universal Framework for Inductive Transfer Parsing across Multi-typed Treebanks
Jiang Guo | Wanxiang Che | Haifeng Wang | Ting Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Various treebanks have been released for dependency parsing. Despite that treebanks may belong to different languages or have different annotation schemes, they contain common syntactic knowledge that is potential to benefit each other. This paper presents a universal framework for transfer parsing across multi-typed treebanks with deep multi-task learning. We consider two kinds of treebanks as source: the multilingual universal treebanks and the monolingual heterogeneous treebanks. Knowledge across the source and target treebanks are effectively transferred through multi-level parameter sharing. Experiments on several benchmark datasets in various languages demonstrate that our approach can make effective use of arbitrary source treebanks to improve target parsing models.

pdf bib
A Unified Architecture for Semantic Role Labeling and Relation Classification
Jiang Guo | Wanxiang Che | Haifeng Wang | Ting Liu | Jun Xu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper describes a unified neural architecture for identifying and classifying multi-typed semantic relations between words in a sentence. We investigate two typical and well-studied tasks: semantic role labeling (SRL) which identifies the relations between predicates and arguments, and relation classification (RC) which focuses on the relation between two entities or nominals. While mostly studied separately in prior work, we show that the two tasks can be effectively connected and modeled using a general architecture. Experiments on CoNLL-2009 benchmark datasets show that our SRL models significantly outperform state-of-the-art approaches. Our RC models also yield competitive performance with the best published records. Furthermore, we show that the two tasks can be trained jointly with multi-task learning, resulting in additive significant improvements for SRL.

pdf bib
Chinese Grammatical Error Diagnosis with Long Short-Term Memory Networks
Bo Zheng | Wanxiang Che | Jiang Guo | Ting Liu
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

Grammatical error diagnosis is an important task in natural language processing. This paper introduces our Chinese Grammatical Error Diagnosis (CGED) system in the NLP-TEA-3 shared task for CGED. The CGED system can diagnose four types of grammatical errors which are redundant words (R), missing words (M), bad word selection (S) and disordered words (W). We treat the CGED task as a sequence labeling task and describe three models, including a CRF-based model, an LSTM-based model and an ensemble model using stacking. We also show in details how we build and train the models. Evaluation includes three levels, which are detection level, identification level and position level. On the CGED-HSK dataset of NLP-TEA-3 shared task, our system presents the best F1-scores in all the three levels and also the best recall in the last two levels.

2015

pdf bib
Cross-lingual Dependency Parsing Based on Distributed Representations
Jiang Guo | Wanxiang Che | David Yarowsky | Haifeng Wang | Ting Liu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

pdf bib
Learning Semantic Hierarchies via Word Embeddings
Ruiji Fu | Jiang Guo | Bing Qin | Wanxiang Che | Haifeng Wang | Ting Liu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Learning Sense-specific Word Embeddings By Exploiting Bilingual Resources
Jiang Guo | Wanxiang Che | Haifeng Wang | Ting Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Revisiting Embedding Features for Simple Semi-supervised Learning
Jiang Guo | Wanxiang Che | Haifeng Wang | Ting Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2010

pdf bib
A domain adaption Word Segmenter For Sighan Backoff 2010
Jiang Guo | Wenjie Su | Yangsen Zhang
CIPS-SIGHAN Joint Conference on Chinese Language Processing