Liang Wang


Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks
Yufeng Zhang | Xueli Yu | Zeyu Cui | Shu Wu | Zhongzhen Wen | Liang Wang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Text classification is fundamental in natural language processing (NLP), and Graph Neural Networks (GNN) have recently been applied to this task. However, existing graph-based works can neither capture the contextual word relationships within each document nor fulfil the inductive learning of new words. In this work, to overcome these problems, we propose TextING for inductive text classification via GNN. We first build an individual graph for each document and then use a GNN to learn fine-grained word representations based on their local structure, which can also effectively produce embeddings for unseen words in a new document. Finally, the word nodes are aggregated into the document embedding. Extensive experiments on four benchmark datasets show that our method outperforms state-of-the-art text classification methods.

Learning distributed sentence vectors with bi-directional 3D convolutions
Bin Liu | Liang Wang | Guosheng Yin
Proceedings of the 28th International Conference on Computational Linguistics

We propose to learn distributed sentence representations using the text’s visual features as input. Unlike existing methods that render the words or characters of a sentence into images separately, we further fold these images into a 3-dimensional sentence tensor. Then, multiple 3-dimensional convolutions with different lengths (the third dimension) are applied to the sentence tensor, acting jointly as bi-gram, tri-gram, quad-gram, and even five-gram detectors. Similar to the Bi-LSTM, these n-gram detectors learn both forward and backward distributional semantic knowledge from the sentence tensor. That is, the proposed model uses bi-directional convolutions to learn text embeddings according to the semantic order of words. The feature maps from the two directions are concatenated for final sentence embedding learning. Our model involves only a single layer of convolution, which makes it easy and fast to train. Finally, we evaluate the sentence embeddings on several downstream Natural Language Processing (NLP) tasks, which demonstrates surprisingly strong performance of the proposed model.


Denoising based Sequence-to-Sequence Pre-training for Text Generation
Liang Wang | Wei Zhao | Ruoyu Jia | Sujian Li | Jingming Liu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper presents a new sequence-to-sequence (seq2seq) pre-training method, PoDA (Pre-training of Denoising Autoencoders), which learns representations suitable for text generation tasks. Unlike encoder-only (e.g., BERT) or decoder-only (e.g., OpenAI GPT) pre-training approaches, PoDA jointly pre-trains both the encoder and decoder by denoising noise-corrupted text, and it also has the advantage of keeping the network architecture unchanged in the subsequent fine-tuning stage. Meanwhile, we design a hybrid model of Transformer and pointer-generator networks as the backbone architecture for PoDA. We conduct experiments on two text generation tasks: abstractive summarization and grammatical error correction. Results on four datasets show that PoDA can improve model performance over strong baselines without using any task-specific techniques and significantly speed up convergence.

Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data
Wei Zhao | Liang Wang | Kewei Shen | Ruoyu Jia | Jingming Liu
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Neural machine translation systems have become the state-of-the-art approach for the Grammatical Error Correction (GEC) task. In this paper, we propose a copy-augmented architecture for the GEC task that copies the unchanged words from the source sentence to the target sentence. Since GEC suffers from not having enough labeled training data to achieve high accuracy, we pre-train the copy-augmented architecture with a denoising auto-encoder using the unlabeled One Billion Benchmark and compare the fully pre-trained model with a partially pre-trained model. This is the first time that copying words from the source context and fully pre-training a sequence-to-sequence model have been tried on the GEC task. Moreover, we add token-level and sentence-level multi-task learning for the GEC task. The evaluation results on the CoNLL-2014 test set show that our approach outperforms all recently published state-of-the-art results by a large margin.


Yuanfudao at SemEval-2018 Task 11: Three-way Attention and Relational Knowledge for Commonsense Machine Comprehension
Liang Wang | Meng Sun | Wei Zhao | Kewei Shen | Jingming Liu
Proceedings of The 12th International Workshop on Semantic Evaluation

This paper describes our system for SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge. We use Three-way Attentive Networks (TriAN) to model interactions between the passage, question and answers. To incorporate commonsense knowledge, we augment the input with relation embeddings from ConceptNet, a graph of general knowledge. As a result, our system achieves state-of-the-art performance with 83.95% accuracy on the official test data. Code is publicly available at

Multi-Perspective Context Aggregation for Semi-supervised Cloze-style Reading Comprehension
Liang Wang | Sujian Li | Wei Zhao | Kewei Shen | Meng Sun | Ruoyu Jia | Jingming Liu
Proceedings of the 27th International Conference on Computational Linguistics

Cloze-style reading comprehension has been a popular task for measuring the progress of natural language understanding in recent years. In this paper, we design a novel multi-perspective framework, which can be seen as the joint training of heterogeneous experts and which aggregates context information from different perspectives. Each perspective is modeled by a simple aggregation module. The outputs of multiple aggregation modules are fed into a one-timestep pointer network to get the final answer. At the same time, to tackle the problem of insufficient labeled data, we propose an efficient sampling mechanism to automatically generate more training examples by matching the distribution of candidates between labeled and unlabeled data. We conduct our experiments on a recently released cloze-test dataset, CLOTH (Xie et al., 2017), which consists of nearly 100k questions designed by professional teachers. Results show that our method achieves new state-of-the-art performance over previous strong baselines.


PKU_ICL at SemEval-2017 Task 10: Keyphrase Extraction with Model Ensemble and External Knowledge
Liang Wang | Sujian Li
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper presents a system that participated in SemEval 2017 Task 10 (subtask A and subtask B): Extracting Keyphrases and Relations from Scientific Publications (Augenstein et al., 2017). Our proposed approach utilizes external knowledge, including Wikipedia, the IEEE taxonomy and pre-trained word embeddings, to enrich the feature representation of candidate keyphrases. An ensemble of unsupervised models, random forest and linear models is used for candidate keyphrase ranking and keyphrase type classification. Our system achieves 3rd place in subtask A and 4th place in subtask B.

Learning to Rank Semantic Coherence for Topic Segmentation
Liang Wang | Sujian Li | Yajuan Lv | Houfeng Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Topic segmentation plays an important role in discourse parsing and information retrieval. Due to the absence of training data, previous work mainly adopts unsupervised methods to rank semantic coherence between paragraphs for topic segmentation. In this paper, we present an intuitive and simple idea to automatically create a “quasi” training dataset, which includes a large number of text pairs from the same or different documents with different semantic coherence. With this training corpus, we design a symmetric CNN to model text pairs and rank their semantic coherence within the learning-to-rank framework. Experiments show that our algorithm achieves competitive performance against strong baselines on several real-world datasets.


Text-level Discourse Dependency Parsing
Sujian Li | Liang Wang | Ziqiang Cao | Wenjie Li
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)