Bing Qin


2020

pdf bib
How Does Selective Mechanism Improve Self-Attention Networks?
Xinwei Geng | Longyue Wang | Xing Wang | Bing Qin | Ting Liu | Zhaopeng Tu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Self-attention networks (SANs) with selective mechanism has produced substantial improvements in various NLP tasks by concentrating on a subset of input words. However, the underlying reasons for their strong performance have not been well explained. In this paper, we bridge the gap by assessing the strengths of selective SANs (SSANs), which are implemented with a flexible and universal Gumbel-Softmax. Experimental results on several representative NLP tasks, including natural language inference, semantic role labelling, and machine translation, show that SSANs consistently outperform the standard SANs. Through well-designed probing experiments, we empirically validate that the improvement of SSANs can be attributed in part to mitigating two commonly-cited weaknesses of SANs: word order encoding and structure modeling. Specifically, the selective mechanism improves SANs by paying more attention to content words that contribute to the meaning of the sentence.

pdf bib
HIT-SCIR at SemEval-2020 Task 5: Training Pre-trained Language Model with Pseudo-labeling Data for Counterfactuals Detection
Xiao Ding | Dingkui Hao | Yuewei Zhang | Kuo Liao | Zhongyang Li | Bing Qin | Ting Liu
Proceedings of the Fourteenth Workshop on Semantic Evaluation

We describe our system for Task 5 of SemEval 2020: Modelling Causal Reasoning in Language: Detecting Counterfactuals. Despite deep learning has achieved significant success in many fields, it still hardly drives today’s AI to strong AI, as it lacks of causation, which is a fundamental concept in human thinking and reasoning. In this task, we dedicate to detecting causation, especially counterfactuals from texts. We explore multiple pre-trained models to learn basic features and then fine-tune models with counterfactual data and pseudo-labeling data. Our team HIT-SCIR wins the first place (1st) in Sub-task 1 — Detecting Counterfactual Statements and is ranked 4th in Sub-task 2 — Detecting Antecedent and Consequence. In this paper we provide a detailed description of the approach, as well as the results obtained in this task.

pdf bib
Revisiting Pre-Trained Models for Chinese Natural Language Processing
Yiming Cui | Wanxiang Che | Ting Liu | Bing Qin | Shijin Wang | Guoping Hu
Findings of the Association for Computational Linguistics: EMNLP 2020

Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks, and consecutive variants have been proposed to further improve the performance of the pre-trained language models. In this paper, we target on revisiting Chinese pre-trained language models to examine their effectiveness in a non-English language and release the Chinese pre-trained language model series to the community. We also propose a simple but effective model called MacBERT, which improves upon RoBERTa in several ways, especially the masking strategy that adopts MLM as correction (Mac). We carried out extensive experiments on eight Chinese NLP tasks to revisit the existing pre-trained language models as well as the proposed MacBERT. Experimental results show that MacBERT could achieve state-of-the-art performances on many NLP tasks, and we also ablate details with several findings that may help future research. https://github.com/ymcui/MacBERT

pdf bib
CodeBERT: A Pre-Trained Model for Programming and Natural Languages
Zhangyin Feng | Daya Guo | Duyu Tang | Nan Duan | Xiaocheng Feng | Ming Gong | Linjun Shou | Bing Qin | Ting Liu | Daxin Jiang | Ming Zhou
Findings of the Association for Computational Linguistics: EMNLP 2020

We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search, code documentation generation, etc. We develop CodeBERT with Transformer-based neural architecture, and train it with a hybrid objective function that incorporates the pre-training task of replaced token detection, which is to detect plausible alternatives sampled from generators. This enables us to utilize both “bimodal” data of NL-PL pairs and “unimodal data, where the former provides input tokens for model training while the latter helps to learn better generators. We evaluate CodeBERT on two NL-PL applications by fine-tuning model parameters. Results show that CodeBERT achieves state-of-the-art performance on both natural language code search and code documentation generation. Furthermore, to investigate what type of knowledge is learned in CodeBERT, we construct a dataset for NL-PL probing, and evaluate in a zero-shot setting where parameters of pre-trained models are fixed. Results show that CodeBERT performs better than previous pre-trained models on NLPL probing.

pdf bib
Enhancing Content Planning for Table-to-Text Generation with Data Understanding and Verification
Heng Gong | Wei Bi | Xiaocheng Feng | Bing Qin | Xiaojiang Liu | Ting Liu
Findings of the Association for Computational Linguistics: EMNLP 2020

Neural table-to-text models, which select and order salient data, as well as verbalizing them fluently via surface realization, have achieved promising progress. Based on results from previous work, the performance bottleneck of current models lies in the stage of content planing (selecting and ordering salient content from the input). That is, performance drops drastically when an oracle content plan is replaced by a model-inferred one during surface realization. In this paper, we propose to enhance neural content planning by (1) understanding data values with contextual numerical value representations that bring the sense of value comparison into content planning; (2) verifying the importance and ordering of the selected sequence of records with policy gradient. We evaluated our model on ROTOWIRE and MLB, two datasets on this task, and results show that our model outperforms existing systems with respect to content planning metrics.

pdf bib
TableGPT: Few-shot Table-to-Text Generation with Table Structure Reconstruction and Content Matching
Heng Gong | Yawei Sun | Xiaocheng Feng | Bing Qin | Wei Bi | Xiaojiang Liu | Ting Liu
Proceedings of the 28th International Conference on Computational Linguistics

Although neural table-to-text models have achieved remarkable progress with the help of large-scale datasets, they suffer insufficient learning problem with limited training data. Recently, pre-trained language models show potential in few-shot learning with linguistic knowledge learnt from pretraining on large-scale corpus. However, benefiting table-to-text generation in few-shot setting with the powerful pretrained language model faces three challenges, including (1) the gap between the task’s structured input and the natural language input for pretraining language model. (2) The lack of modeling for table structure and (3) improving text fidelity with less incorrect expressions that are contradicting to the table. To address aforementioned problems, we propose TableGPT for table-to-text generation. At first, we utilize table transformation module with template to rewrite structured table in natural language as input for GPT-2. In addition, we exploit multi-task learning with two auxiliary tasks that preserve table’s structural information by reconstructing the structure from GPT-2’s representation and improving the text’s fidelity with content matching task aligning the table and information in the generated text. By experimenting on Humans, Songs and Books, three few-shot table-to-text datasets in different domains, our model outperforms existing systems on most few-shot settings.

pdf bib
Molweni: A Challenge Multiparty Dialogues-based Machine Reading Comprehension Dataset with Discourse Structure
Jiaqi Li | Ming Liu | Min-Yen Kan | Zihao Zheng | Zekun Wang | Wenqiang Lei | Ting Liu | Bing Qin
Proceedings of the 28th International Conference on Computational Linguistics

Research into the area of multiparty dialog has grown considerably over recent years. We present the Molweni dataset, a machine reading comprehension (MRC) dataset with discourse structure built over multiparty dialog. Molweni’s source samples from the Ubuntu Chat Corpus, including 10,000 dialogs comprising 88,303 utterances. We annotate 30,066 questions on this corpus, including both answerable and unanswerable questions. Molweni also uniquely contributes discourse dependency annotations in a modified Segmented Discourse Representation Theory (SDRT; Asher et al., 2016) style for all of its multiparty dialogs, contributing large-scale (78,245 annotated discourse relations) data to bear on the task of multiparty dialog discourse parsing. Our experiments show that Molweni is a challenging dataset for current MRC models: BERT-wwm, a current, strong SQuAD 2.0 performer, achieves only 67.7% F1 on Molweni’s questions, a 20+% significant drop as compared against its SQuAD 2.0 performance.

pdf bib
An Iterative Emotion Interaction Network for Emotion Recognition in Conversations
Xin Lu | Yanyan Zhao | Yang Wu | Yijian Tian | Huipeng Chen | Bing Qin
Proceedings of the 28th International Conference on Computational Linguistics

Emotion recognition in conversations (ERC) has received much attention recently in the natural language processing community. Considering that the emotions of the utterances in conversations are interactive, previous works usually implicitly model the emotion interaction between utterances by modeling dialogue context, but the misleading emotion information from context often interferes with the emotion interaction. We noticed that the gold emotion labels of the context utterances can provide explicit and accurate emotion interaction, but it is impossible to input gold labels at inference time. To address this problem, we propose an iterative emotion interaction network, which uses iteratively predicted emotion labels instead of gold emotion labels to explicitly model the emotion interaction. This approach solves the above problem, and can effectively retain the performance advantages of explicit modeling. We conduct experiments on two datasets, and our approach achieves state-of-the-art performance.

2019

pdf bib
Multi-Input Multi-Output Sequence Labeling for Joint Extraction of Fact and Condition Tuples from Scientific Text
Tianwen Jiang | Tong Zhao | Bing Qin | Ting Liu | Nitesh Chawla | Meng Jiang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Condition is essential in scientific statement. Without the conditions (e.g., equipment, environment) that were precisely specified, facts (e.g., observations) in the statements may no longer be valid. Existing ScienceIE methods, which aim at extracting factual tuples from scientific text, do not consider the conditions. In this work, we propose a new sequence labeling framework (as well as a new tag schema) to jointly extract the fact and condition tuples from statement sentences. The framework has (1) a multi-output module to generate one or multiple tuples and (2) a multi-input module to feed in multiple types of signals as sequences. It improves F1 score relatively by 4.2% on BioNLP2013 and by 6.2% on a new bio-text dataset for tuple extraction.

pdf bib
Cross-Lingual Machine Reading Comprehension
Yiming Cui | Wanxiang Che | Ting Liu | Bing Qin | Shijin Wang | Guoping Hu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Though the community has made great progress on Machine Reading Comprehension (MRC) task, most of the previous works are solving English-based MRC problems, and there are few efforts on other languages mainly due to the lack of large-scale training data.In this paper, we propose Cross-Lingual Machine Reading Comprehension (CLMRC) task for the languages other than English. Firstly, we present several back-translation approaches for CLMRC task which is straightforward to adopt. However, to exactly align the answer into source language is difficult and could introduce additional noise. In this context, we propose a novel model called Dual BERT, which takes advantage of the large-scale training data provided by rich-resource language (such as English) and learn the semantic relations between the passage and question in bilingual context, and then utilize the learned knowledge to improve reading comprehension performance of low-resource language. We conduct experiments on two Chinese machine reading comprehension datasets CMRC 2018 and DRCD. The results show consistent and significant improvements over various state-of-the-art systems by a large margin, which demonstrate the potentials in CLMRC task. Resources available: https://github.com/ymcui/Cross-Lingual-MRC

pdf bib
Enhancing Neural Data-To-Text Generation Models with External Background Knowledge
Shuang Chen | Jinpeng Wang | Xiaocheng Feng | Feng Jiang | Bing Qin | Chin-Yew Lin
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Recent neural models for data-to-text generation rely on massive parallel pairs of data and text to learn the writing knowledge. They often assume that writing knowledge can be acquired from the training data alone. However, when people are writing, they not only rely on the data but also consider related knowledge. In this paper, we enhance neural data-to-text models with external knowledge in a simple but effective way to improve the fidelity of generated text. Besides relying on parallel data and text as in previous work, our model attends to relevant external knowledge, encoded as a temporary memory, and combines this knowledge with the context representation of data before generating words. This allows the model to infer relevant facts which are not explicitly stated in the data table from an external knowledge source. Experimental results on twenty-one Wikipedia infobox-to-text datasets show our model, KBAtt, consistently improves a state-of-the-art model on most of the datasets. In addition, to quantify when and why external knowledge is effective, we design a metric, KBGain, which shows a strong correlation with the observed performance boost. This result demonstrates the relevance of external knowledge and sparseness of original data are the main factors affecting system performance.

pdf bib
Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time)
Heng Gong | Xiaocheng Feng | Bing Qin | Ting Liu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Although Seq2Seq models for table-to-text generation have achieved remarkable progress, modeling table representation in one dimension is inadequate. This is because (1) the table consists of multiple rows and columns, which means that encoding a table should not depend only on one dimensional sequence or set of records and (2) most of the tables are time series data (e.g. NBA game data, stock market data), which means that the description of the current table may be affected by its historical data. To address aforementioned problems, not only do we model each table cell considering other records in the same row, we also enrich table’s representation by modeling each table cell in context of other cells in the same column or with historical (time dimension) data respectively. In addition, we develop a table cell fusion gate to combine representations from row, column and time dimension into one dense vector according to the saliency of each dimension’s representation. We evaluated our methods on ROTOWIRE, a benchmark dataset of NBA basketball games. Both automatic and human evaluation results demonstrate the effectiveness of our model with improvement of 2.66 in BLEU over the strong baseline and outperformance of state-of-the-art model.

pdf bib
Learning to Ask Unanswerable Questions for Machine Reading Comprehension
Haichao Zhu | Li Dong | Furu Wei | Wenhui Wang | Bing Qin | Ting Liu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Machine reading comprehension with unanswerable questions is a challenging task. In this work, we propose a data augmentation technique by automatically generating relevant unanswerable questions according to an answerable question paired with its corresponding paragraph that contains the answer. We introduce a pair-to-sequence model for unanswerable question generation, which effectively captures the interactions between the question and the paragraph. We also present a way to construct training data for our question generation models by leveraging the existing reading comprehension dataset. Experimental results show that the pair-to-sequence model performs consistently better compared with the sequence-to-sequence baseline. We further use the automatically generated unanswerable questions as a means of data augmentation on the SQuAD 2.0 dataset, yielding 1.9 absolute F1 improvement with BERT-base model and 1.7 absolute F1 improvement with BERT-large model.

2018

pdf bib
Parsing Tweets into Universal Dependencies
Yijia Liu | Yi Zhu | Wanxiang Che | Bing Qin | Nathan Schneider | Noah A. Smith
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We study the problem of analyzing tweets with universal dependencies (UD). We extend the UD guidelines to cover special constructions in tweets that affect tokenization, part-of-speech tagging, and labeled dependencies. Using the extended guidelines, we create a new tweet treebank for English (Tweebank v2) that is four times larger than the (unlabeled) Tweebank v1 introduced by Kong et al. (2014). We characterize the disagreements between our annotators and show that it is challenging to deliver consistent annotation due to ambiguity in understanding and explaining tweets. Nonetheless, using the new treebank, we build a pipeline system to parse raw tweets into UD. To overcome the annotation noise without sacrificing computational efficiency, we propose a new method to distill an ensemble of 20 transition-based parsers into a single one. Our parser achieves an improvement of 2.2 in LAS over the un-ensembled baseline and outperforms parsers that are state-of-the-art on other treebanks in both accuracy and speed.

pdf bib
Adaptive Multi-pass Decoder for Neural Machine Translation
Xinwei Geng | Xiaocheng Feng | Bing Qin | Ting Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Although end-to-end neural machine translation (NMT) has achieved remarkable progress in the recent years, the idea of adopting multi-pass decoding mechanism into conventional NMT is not well explored. In this paper, we propose a novel architecture called adaptive multi-pass decoder, which introduces a flexible multi-pass polishing mechanism to extend the capacity of NMT via reinforcement learning. More specifically, we adopt an extra policy network to automatically choose a suitable and effective number of decoding passes, according to the complexity of source sentences and the quality of the generated translations. Extensive experiments on Chinese-English translation demonstrate the effectiveness of our proposed adaptive multi-pass decoder upon the conventional NMT with a significant improvement about 1.55 BLEU.

pdf bib
An AMR Aligner Tuned by Transition-based Parser
Yijia Liu | Wanxiang Che | Bo Zheng | Bing Qin | Ting Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In this paper, we propose a new rich resource enhanced AMR aligner which produces multiple alignments and a new transition system for AMR parsing along with its oracle parser. Our aligner is further tuned by our oracle parser via picking the alignment that leads to the highest-scored achievable AMR graph. Experimental results show that our aligner outperforms the rule-based aligner in previous work by achieving higher alignment F1 score and consistently improving two open-sourced AMR parsers. Based on our aligner and transition system, we develop a transition-based AMR parser that parses a sentence into its AMR graph directly. An ensemble of our parsers with only words and POS tags as input leads to 68.4 Smatch F1 score, which outperforms the current state-of-the-art parser.

pdf bib
Semantic Parsing with Syntax- and Table-Aware SQL Generation
Yibo Sun | Duyu Tang | Nan Duan | Jianshu Ji | Guihong Cao | Xiaocheng Feng | Bing Qin | Ting Liu | Ming Zhou
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a generative model to map natural language questions into SQL queries. Existing neural network based approaches typically generate a SQL query word-by-word, however, a large portion of the generated results is incorrect or not executable due to the mismatch between question words and table contents. Our approach addresses this problem by considering the structure of table and the syntax of SQL language. The quality of the generated SQL query is significantly improved through (1) learning to replicate content from column names, cells or SQL keywords; and (2) improving the generation of WHERE clause by leveraging the column-cell relation. Experiments are conducted on WikiSQL, a recently released dataset with the largest question- SQL pairs. Our approach significantly improves the state-of-the-art execution accuracy from 69.0% to 74.4%.

pdf bib
Distilling Knowledge for Search-based Structured Prediction
Yijia Liu | Wanxiang Che | Huaipeng Zhao | Bing Qin | Ting Liu
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Many natural language processing tasks can be modeled into structured prediction and solved as a search problem. In this paper, we distill an ensemble of multiple models trained with different initialization into a single model. In addition to learning to match the ensemble’s probability output on the reference states, we also use the ensemble to explore the search space and learn from the encountered states in the exploration. Experimental results on two typical search-based structured prediction tasks – transition-based dependency parsing and neural machine translation show that distillation can effectively improve the single model’s performance and the final model achieves improvements of 1.32 in LAS and 2.65 in BLEU score on these two tasks respectively over strong baselines and it outperforms the greedy structured prediction models in previous literatures.

2017

pdf bib
Benben: A Chinese Intelligent Conversational Robot
Wei-Nan Zhang | Ting Liu | Bing Qin | Yu Zhang | Wanxiang Che | Yanyan Zhao | Xiao Ding
Proceedings of ACL 2017, System Demonstrations

2016

pdf bib
Aspect Level Sentiment Classification with Deep Memory Network
Duyu Tang | Bing Qin | Ting Liu
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
English-Chinese Knowledge Base Translation with Neural Network
Xiaocheng Feng | Duyu Tang | Bing Qin | Ting Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Knowledge base (KB) such as Freebase plays an important role for many natural language processing tasks. English knowledge base is obviously larger and of higher quality than low resource language like Chinese. To expand Chinese KB by leveraging English KB resources, an effective way is to translate English KB (source) into Chinese (target). In this direction, two major challenges are to model triple semantics and to build a robust KB translator. We address these challenges by presenting a neural network approach, which learns continuous triple representation with a gated neural network. Accordingly, source triples and target triples are mapped in the same semantic vector space. We build a new dataset for English-Chinese KB translation from Freebase, and compare with several baselines on it. Experimental results show that the proposed method improves translation accuracy compared with baseline methods. We show that adaptive composition model improves standard solution such as neural tensor network in terms of translation accuracy.

pdf bib
Effective LSTMs for Target-Dependent Sentiment Classification
Duyu Tang | Bing Qin | Xiaocheng Feng | Ting Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Target-dependent sentiment classification remains a challenge: modeling the semantic relatedness of a target with its context words in a sentence. Different context words have different influences on determining the sentiment polarity of a sentence towards the target. Therefore, it is desirable to integrate the connections between target word and context words when building a learning system. In this paper, we develop two target dependent long short-term memory (LSTM) models, where target information is automatically taken into account. We evaluate our methods on a benchmark dataset from Twitter. Empirical results show that modeling sentence representation with standard LSTM does not perform well. Incorporating target information into LSTM can significantly boost the classification accuracy. The target-dependent LSTM models achieve state-of-the-art performances without using syntactic parser or external sentiment lexicons.

pdf bib
A Language-Independent Neural Network for Event Detection
Xiaocheng Feng | Lifu Huang | Duyu Tang | Heng Ji | Bing Qin | Ting Liu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
SemEval-2016 Task 5: Aspect Based Sentiment Analysis
Maria Pontiki | Dimitris Galanis | Haris Papageorgiou | Ion Androutsopoulos | Suresh Manandhar | Mohammad AL-Smadi | Mahmoud Al-Ayyoub | Yanyan Zhao | Bing Qin | Orphée De Clercq | Véronique Hoste | Marianna Apidianaki | Xavier Tannier | Natalia Loukachevitch | Evgeniy Kotelnikov | Nuria Bel | Salud María Jiménez-Zafra | Gülşen Eryiğit
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

pdf bib
Document Modeling with Gated Recurrent Neural Network for Sentiment Classification
Duyu Tang | Bing Qin | Ting Liu
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Transition-Based Syntactic Linearization
Yijia Liu | Yue Zhang | Wanxiang Che | Bing Qin
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Encoding World Knowledge in the Evaluation of Local Coherence
Muyu Zhang | Vanessa Wei Feng | Bing Qin | Graeme Hirst | Ting Liu | Jingwen Huang
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Encoding Distributional Semantics into Triple-Based Knowledge Ranking for Document Enrichment
Muyu Zhang | Bing Qin | Mao Zheng | Graeme Hirst | Ting Liu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Learning Semantic Representations of Users and Products for Document Level Sentiment Classification
Duyu Tang | Bing Qin | Ting Liu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

pdf bib
Coooolll: A Deep Learning System for Twitter Sentiment Classification
Duyu Tang | Furu Wei | Bing Qin | Ting Liu | Ming Zhou
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

pdf bib
Learning Semantic Hierarchies via Word Embeddings
Ruiji Fu | Jiang Guo | Bing Qin | Wanxiang Che | Haifeng Wang | Ting Liu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification
Duyu Tang | Furu Wei | Nan Yang | Ming Zhou | Ting Liu | Bing Qin
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Building Large-Scale Twitter-Specific Sentiment Lexicon : A Representation Learning Approach
Duyu Tang | Furu Wei | Bing Qin | Ming Zhou | Ting Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Triple based Background Knowledge Ranking for Document Enrichment
Muyu Zhang | Bing Qin | Ting Liu | Mao Zheng
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Sentence Compression for Target-Polarity Word Collocation Extraction
Yanyan Zhao | Wanxiang Che | Honglei Guo | Bing Qin | Zhong Su | Ting Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
A Joint Segmentation and Classification Framework for Sentiment Analysis
Duyu Tang | Furu Wei | Bing Qin | Li Dong | Ting Liu | Ming Zhou
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf bib
Improving Web Search Ranking by Incorporating Structured Annotation of Queries
Xiao Ding | Zhicheng Dou | Bing Qin | Ting Liu | Ji-Rong Wen
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Microblog Entity Linking by Leveraging Extra Posts
Yuhang Guo | Bing Qin | Ting Liu | Sheng Li
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Exploiting Multiple Sources for Open-Domain Hypernym Discovery
Ruiji Fu | Bing Qin | Ting Liu
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Building Chinese Event Type Paradigm Based on Trigger Clustering
Xiao Ding | Bing Qin | Ting Liu
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf bib
Topical Key Concept Extraction from Folksonomy
Han Xue | Bing Qin | Ting Liu | Chao Xiang
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

pdf bib
Collocation Polarity Disambiguation Using Web-based Pseudo Contexts
Yanyan Zhao | Bing Qin | Ting Liu
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

pdf bib
Coreference Resolution System using Maximum Entropy Classifier
Weipeng Chen | Muyu Zhang | Bing Qin
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

pdf bib
Generating Chinese Named Entity Data from a Parallel Corpus
Ruiji Fu | Bing Qin | Ting Liu
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

pdf bib
Generalizing Syntactic Structures for Product Attribute Candidate Extraction
Yanyan Zhao | Bing Qin | Shen Hu | Ting Liu
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2009

pdf bib
Multilingual Dependency-based Syntactic and Semantic Parsing
Wanxiang Che | Zhenghua Li | Yongqiang Li | Yuhang Guo | Bing Qin | Ting Liu
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task

2008

pdf bib
A Cascaded Syntactic and Semantic Dependency Parsing System
Wanxiang Che | Zhenghua Li | Yuxuan Hu | Yongqiang Li | Bing Qin | Ting Liu | Sheng Li
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning