Kunlong Chen


SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check
Xingyi Cheng | Weidi Xu | Kunlong Chen | Shaohua Jiang | Feng Wang | Taifeng Wang | Wei Chu | Yuan Qi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Chinese Spelling Check (CSC) is the task of detecting and correcting spelling errors in Chinese text. Existing methods have attempted to incorporate similarity knowledge between Chinese characters. However, they treat this knowledge either as an external input resource or as heuristic rules. This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN). The model builds a graph over the characters, and SpellGCN learns to map this graph into a set of inter-dependent character classifiers. These classifiers are applied to the representations extracted by another network, such as BERT, so that the whole network is end-to-end trainable. Experiments are conducted on three human-annotated datasets. Our method outperforms previous models by a large margin.
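
The idea of mapping a character graph to inter-dependent classifiers can be illustrated with a toy sketch. This is not the authors' implementation: the four-character vocabulary, the edges, the identity projection, and the hand-written encoder vector are all assumptions made for illustration, and the abstract does not say how the phonological and visual graphs are combined, so a single graph is used here.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_rows(adj):
    """Add self-loops, then scale each row so that it sums to 1."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def graph_conv(adj, node_feats, weight):
    """One graph convolution: average each node with its neighbors, then project."""
    return matmul(matmul(normalize_rows(adj), node_feats), weight)

# Hypothetical similarity graph over 4 characters: 0-1 are similar, 2-3 are similar.
adj = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]

rng = random.Random(0)
dim = 3
char_emb = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(4)]
# Identity projection keeps the example easy to follow; a trained model learns this.
weight = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]

# One classifier vector per character, each informed by its graph neighbors.
classifiers = graph_conv(adj, char_emb, weight)

# Stand-in for an encoder output (e.g. from BERT) at one token position.
token_repr = [[0.5, -0.2, 0.1]]
logits = matmul(token_repr, [list(col) for col in zip(*classifiers)])[0]
predicted = max(range(4), key=lambda i: logits[i])
```

Because connected characters average each other's embeddings, their classifier vectors are tied together, which is one simple way the classifiers can be "inter-dependent".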

Towards Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning
Weipeng Huang | Xingyi Cheng | Kunlong Chen | Taifeng Wang | Wei Chu
Proceedings of the 28th International Conference on Computational Linguistics

Ambiguous annotation criteria cause Chinese Word Segmentation (CWS) datasets to diverge in granularity. Multi-criteria Chinese word segmentation aims to capture the various annotation criteria among datasets and leverage their common underlying knowledge. In this paper, we propose a domain-adaptive segmenter that exploits the diverse criteria of various datasets. Our model is based on Bidirectional Encoder Representations from Transformers (BERT), which introduces open-domain knowledge. Private and shared projection layers are proposed to capture domain-specific knowledge and common knowledge, respectively. We also optimize computational efficiency via distillation, quantization, and compiler optimization. Experiments show that our segmenter outperforms the previous state-of-the-art (SOTA) models on 10 CWS datasets with superior efficiency.
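
A minimal sketch of private and shared projection layers follows. It is an illustration under stated assumptions, not the paper's code: the criterion names `pku` and `msr`, the weight values, the B/M/E/S tag set, and the choice to sum the two projections are all hypothetical, since the abstract does not specify them.

```python
TAGS = ["B", "M", "E", "S"]  # begin, middle, end, single-character word (assumed tag set)

def linear(weights, vec):
    """Apply a bias-free linear layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

# Shared projection: used for every dataset, meant to hold common knowledge.
shared_proj = [[0.2, 0.1, 0.0],
               [0.0, 0.3, 0.1],
               [0.1, 0.0, 0.2],
               [0.0, 0.0, 0.0]]

# Private projections: one per annotation criterion (hypothetical names and values).
private_proj = {
    "pku": [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    "msr": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -1.0]],
}

def tag_logits(hidden, criterion):
    """Combine shared and criterion-specific logits (summation is an assumption)."""
    shared = linear(shared_proj, hidden)
    private = linear(private_proj[criterion], hidden)
    return [s + p for s, p in zip(shared, private)]

def predict(hidden, criterion):
    """Return the highest-scoring tag for one character under one criterion."""
    logits = tag_logits(hidden, criterion)
    return TAGS[max(range(len(TAGS)), key=lambda i: logits[i])]

# Stand-in for the encoder's representation of one character.
hidden = [1.0, 0.0, -1.0]
```

With these toy weights the same hidden vector can receive different tags under different criteria, for example `predict(hidden, "pku")` versus `predict(hidden, "msr")`, while the shared layer contributes identically to both.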

Question Directed Graph Attention Network for Numerical Reasoning over Text
Kunlong Chen | Weidi Xu | Xingyi Cheng | Zou Xiaochuan | Yuyu Zhang | Le Song | Taifeng Wang | Yuan Qi | Wei Chu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Numerical reasoning over text, such as addition, subtraction, sorting, and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question-directed graph attention network to drive multi-step numerical reasoning over this context graph. Our model, which combines deep learning and graph reasoning, achieves remarkable results on benchmark datasets such as DROP.


Variational Semi-Supervised Aspect-Term Sentiment Analysis via Transformer
Xingyi Cheng | Weidi Xu | Taifeng Wang | Wei Chu | Weipeng Huang | Kunlong Chen | Junfeng Hu
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Aspect-term sentiment analysis (ATSA) is a long-standing challenge in natural language processing. It requires fine-grained semantic reasoning about a target entity that appears in the text. As manual annotation of aspects is laborious and time-consuming, the amount of labeled data available for supervised learning is limited. This paper proposes a semi-supervised method for the ATSA problem using a Transformer-based Variational Autoencoder. The model learns the latent distribution via variational inference. By disentangling the latent representation into the aspect-specific sentiment and the lexical context, our method induces the underlying sentiment prediction for the unlabeled data, which then benefits the ATSA classifier. Our method is classifier-agnostic, i.e., the classifier is an independent module and various supervised models can be integrated. Experimental results on SemEval 2014 Task 4 show that our method is effective with five different classifiers and outperforms these models by a significant margin.