Houfeng Wang


2020

pdf bib
Syntax-Aware Graph Attention Network for Aspect-Level Sentiment Classification
Lianzhe Huang | Xin Sun | Sujian Li | Linhao Zhang | Houfeng Wang
Proceedings of the 28th International Conference on Computational Linguistics

Aspect-level sentiment classification aims to distinguish the sentiment polarities of aspect terms in a sentence. Existing approaches mostly focus on modeling the relationship between the given aspect words and their contexts with attention, and ignore the more elaborate knowledge implicit in the context. In this paper, we add syntactic awareness to the model with a graph attention network over the dependency tree structure, and external pre-training knowledge with the BERT language model, which helps to better model the interaction between the context and the aspect words. The subwords produced by BERT are integrated into the dependency tree graphs, so that graph attention can obtain more accurate word representations. Experiments demonstrate the effectiveness of our model.
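As a rough illustration of the mechanism described in the abstract (not the authors' code), the following minimal PyTorch sketch runs one graph-attention update over a dependency adjacency matrix; in the paper's setting, BERT subword nodes would simply be linked into this graph, for example to their parent word's node.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DependencyGraphAttention(nn.Module):
    """One graph-attention layer over a dependency graph (minimal sketch)."""

    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, h, adj):
        # h:   (n, dim) node states, one node per (sub)word
        # adj: (n, n) 0/1 adjacency from the dependency tree; subword nodes
        #      can simply be linked to their parent word node
        n = h.size(0)
        hp = self.proj(h)
        pairs = torch.cat([hp.unsqueeze(1).expand(n, n, -1),
                           hp.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = self.score(pairs).squeeze(-1)           # (n, n) raw pair scores
        e = e.masked_fill(adj == 0, float("-inf"))  # attend only along edges
        att = F.softmax(e, dim=-1)
        return torch.relu(att @ hp)                 # updated node states

# toy usage: 4 nodes with self-loops plus a small chain of dependency edges
h = torch.randn(4, 16)
adj = torch.eye(4)
adj[0, 1] = adj[1, 0] = 1.0
adj[1, 2] = adj[2, 1] = 1.0
adj[2, 3] = adj[3, 2] = 1.0
print(DependencyGraphAttention(16)(h, adj).shape)  # torch.Size([4, 16])
```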

2019

pdf bib
Text Level Graph Neural Network for Text Classification
Lianzhe Huang | Dehong Ma | Sujian Li | Xiaodong Zhang | Houfeng Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Recently, researchers have explored graph neural network (GNN) techniques for text classification, since GNNs handle complex structures well and preserve global information. However, previous GNN-based methods face two practical problems: a fixed corpus-level graph structure that does not support online testing, and high memory consumption. To tackle these problems, we propose a new GNN-based model that builds a graph for each input text with globally shared parameters, instead of a single graph for the whole corpus. This removes the dependence between an individual text and the entire corpus, which supports online testing while still preserving global information. Besides, we build graphs with much smaller windows in the text, which not only extracts more local features but also significantly reduces the number of edges and the memory consumption. Experiments show that our model outperforms existing models on several text classification datasets while consuming less memory.
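To make the graph-construction idea concrete, here is a small hedged sketch (plain Python, illustrative names only) of building a per-text graph from word ids with a small sliding window, while keying edge weights by vocabulary ids so they can be shared globally across texts.

```python
from collections import defaultdict

def build_text_graph(token_ids, window=3):
    """Build the edges of a per-text graph by connecting each token to its
    neighbours within a small sliding window (minimal sketch)."""
    edges = set()
    for i, u in enumerate(token_ids):
        start, end = max(0, i - window), min(len(token_ids), i + window + 1)
        for j in range(start, end):
            edges.add((u, token_ids[j]))   # keyed by word ids, so the weight
    return edges                           # below is shared across all texts

edge_weight = defaultdict(lambda: 1.0)     # one globally shared weight per edge key
for e in build_text_graph([4, 17, 9, 4, 23], window=2):
    _ = edge_weight[e]
print(len(edge_weight), "globally keyed edges for this toy text")
```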

pdf bib
Exploring Sequence-to-Sequence Learning in Aspect Term Extraction
Dehong Ma | Sujian Li | Fangzhao Wu | Xing Xie | Houfeng Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Aspect term extraction (ATE) aims to identify all aspect terms in a sentence and is usually modeled as a sequence labeling problem. However, sequence labeling based methods cannot make full use of the overall meaning of the whole sentence and are limited in handling dependencies between labels. To tackle these problems, we explore formalizing ATE as a sequence-to-sequence (Seq2Seq) learning task, where the source sequence and target sequence are composed of words and labels respectively. To make Seq2Seq learning suit ATE, where labels correspond to words one by one, we design gated unit networks that incorporate the corresponding word representation into the decoder, and position-aware attention that attends more to the words adjacent to a target word. Experimental results on two datasets show that Seq2Seq learning is effective for ATE when combined with our proposed gated unit networks and position-aware attention mechanism.
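The gated unit can be pictured roughly as follows: a hypothetical PyTorch sketch (not the authors' released code) in which a sigmoid gate mixes the decoder state with the representation of the word currently being labeled.

```python
import torch
import torch.nn as nn

class GatedUnit(nn.Module):
    """Minimal sketch of a gate that injects the aligned word representation
    into the decoder state before predicting its label."""

    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, dec_state, word_repr):
        # dec_state, word_repr: (batch, dim), aligned one-to-one per position
        g = torch.sigmoid(self.gate(torch.cat([dec_state, word_repr], dim=-1)))
        return g * dec_state + (1.0 - g) * word_repr

fused = GatedUnit(32)(torch.randn(2, 32), torch.randn(2, 32))
print(fused.shape)  # torch.Size([2, 32])
```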

2018

pdf bib
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
Fenglin Liu | Xuancheng Ren | Yuanxin Liu | Houfeng Wang | Xu Sun
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The encoder-decoder framework has shown recent success in image captioning. Visual attention, which excels at detail, and semantic attention, which excels at comprehensiveness, have been separately proposed to ground the caption in the image. In this paper, we propose the Stepwise Image-Topic Merging Network (simNet), which makes use of both kinds of attention at the same time. At each time step when generating the caption, the decoder adaptively merges the attentive information from the extracted topics and the image according to the generated context, so that the visual information and the semantic information can be effectively combined. The proposed approach is evaluated on two benchmark datasets and reaches state-of-the-art performance.
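The adaptive merging step can be illustrated with a small hedged sketch: a gate computed from the decoding context interpolates between the visually attended vector and the topic (semantic) attended vector. Names and shapes here are illustrative assumptions, not the released simNet code.

```python
import torch
import torch.nn as nn

class StepwiseMerge(nn.Module):
    """Sketch of merging visual and semantic attention at one decoding step:
    a gate computed from the decoder context decides how much of each to use."""

    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(dim, 1)

    def forward(self, context, visual, semantic):
        # context, visual, semantic: (batch, dim)
        g = torch.sigmoid(self.gate(context))      # (batch, 1) adaptive weight
        return g * visual + (1.0 - g) * semantic

merged = StepwiseMerge(64)(torch.randn(2, 64), torch.randn(2, 64), torch.randn(2, 64))
print(merged.shape)  # torch.Size([2, 64])
```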

pdf bib
Auto-Dialabel: Labeling Dialogue Data with Unsupervised Learning
Chen Shi | Qi Chen | Lei Sha | Sujian Li | Xu Sun | Houfeng Wang | Lintao Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The lack of labeled data is one of the main challenges when building a task-oriented dialogue system. Existing dialogue datasets usually rely on human labeling, which is expensive, limited in size, and low in coverage. In this paper, we instead propose auto-dialabel, a framework that automatically clusters dialogue intents and slots. In this framework, we collect a set of context features, leverage an autoencoder for feature assembly, and adapt a dynamic hierarchical clustering method for intent and slot labeling. Experimental results show that our framework can greatly reduce the cost of human labeling, achieves good intent clustering accuracy (84.1%), and provides reasonable and instructive slot labeling results.
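As an illustration of the clustering step only (feature collection and the autoencoder assembly are omitted), the sketch below uses scikit-learn's agglomerative clustering with a distance threshold so that the number of intent clusters is not fixed in advance; this is an assumption-laden stand-in, not the auto-dialabel implementation.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def cluster_intents(features, distance_threshold=5.0):
    """Hierarchically cluster utterance features; the distance threshold
    (rather than a fixed k) decides how many intent clusters emerge."""
    model = AgglomerativeClustering(n_clusters=None,
                                    distance_threshold=distance_threshold)
    return model.fit_predict(features)

rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (10, 8)), rng.normal(6, 1, (10, 8))])
print(cluster_intents(feats))   # cluster id per toy utterance feature vector
```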

pdf bib
Phrase-level Self-Attention Networks for Universal Sentence Encoding
Wei Wu | Houfeng Wang | Tianyu Liu | Shuming Ma
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Universal sentence encoding is a hot topic in recent NLP research. The attention mechanism has been an integral part of many sentence encoding models, allowing them to capture context dependencies regardless of the distance between elements in the sequence. Fully attention-based models have recently attracted enormous interest due to their highly parallelizable computation and significantly shorter training time. However, their memory consumption grows quadratically with sentence length, and syntactic information is neglected. To this end, we propose Phrase-level Self-Attention Networks (PSAN), which perform self-attention across the words inside a phrase to capture context dependencies at the phrase level, and use a gated memory updating mechanism to refine each word’s representation hierarchically with longer-term context dependencies captured in a larger phrase. As a result, memory consumption is reduced because self-attention is performed at the phrase level instead of the sentence level. At the same time, syntactic information can be easily integrated into the model. Experimental results show that PSAN achieves state-of-the-art performance across a plethora of NLP tasks including binary and multi-class classification, natural language inference and sentence similarity.
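A minimal way to picture phrase-level self-attention is to mask ordinary dot-product attention so that each word attends only to words in the same phrase; the sketch below (illustrative, not the PSAN code, and omitting the gated memory update) does exactly that.

```python
import torch
import torch.nn.functional as F

def phrase_self_attention(h, phrase_id):
    """Self-attention restricted to words that share a phrase (minimal sketch).
    h: (n, d) word states, phrase_id: (n,) phrase index per word."""
    mask = phrase_id.unsqueeze(0) == phrase_id.unsqueeze(1)   # (n, n) same-phrase
    scores = (h @ h.t()) / h.size(-1) ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ h

h = torch.randn(6, 8)
phrase_id = torch.tensor([0, 0, 1, 1, 1, 2])   # e.g. from a shallow parser
print(phrase_self_attention(h, phrase_id).shape)  # torch.Size([6, 8])
```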

pdf bib
Joint Learning for Targeted Sentiment Analysis
Dehong Ma | Sujian Li | Houfeng Wang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Targeted sentiment analysis (TSA) aims at extracting targets and classifying their sentiment classes. Previous work exploits only word embeddings as features and does not explore the further potential of neural networks when jointly learning the two tasks. In this paper, we carefully design a hierarchical stacked bidirectional gated recurrent unit (HSBi-GRU) model to learn abstract features for both tasks, and we propose an HSBi-GRU based joint model which allows the target label to influence its sentiment label. Experimental results on two datasets show that our joint learning model outperforms other baselines, and demonstrate the effectiveness of HSBi-GRU in learning abstract features.

pdf bib
A Neural Question Answering Model Based on Semi-Structured Tables
Hao Wang | Xiaodong Zhang | Shuming Ma | Xu Sun | Houfeng Wang | Mengxiang Wang
Proceedings of the 27th International Conference on Computational Linguistics

Most question answering (QA) systems are based on raw text or structured knowledge graphs. However, raw text corpora are hard for a QA system to understand, and structured knowledge graphs require intensive manual work, whereas semi-structured tables are relatively easy to obtain directly from many sources or to build automatically. In this paper, we build an end-to-end system that answers multiple-choice questions with semi-structured tables as its knowledge. Our system answers queries in two steps. First, it finds the most similar tables. Then it measures the relevance between the question and the candidate table cells, and chooses the most relevant cell as the source of the answer. The system is evaluated on the TabMCQ dataset and achieves a large improvement over the state of the art.
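The two-step answering procedure can be sketched with plain cosine similarity over pre-computed vectors (a simplification; the paper scores relevance with a neural model): first retrieve the most similar table, then the most relevant cell.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def answer(question_vec, tables):
    """Two-step sketch: pick the most similar table, then its most similar cell.
    tables: list of (table_vec, [(cell_text, cell_vec), ...])."""
    best_table = max(tables, key=lambda t: cosine(question_vec, t[0]))
    cell_text, _ = max(best_table[1], key=lambda c: cosine(question_vec, c[1]))
    return cell_text

rng = np.random.default_rng(0)
q = rng.normal(size=16)
tables = [(rng.normal(size=16), [("cell-%d-%d" % (i, j), rng.normal(size=16))
                                 for j in range(3)]) for i in range(2)]
print(answer(q, tables))   # text of the highest-scoring cell
```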

pdf bib
SGM: Sequence Generation Model for Multi-label Classification
Pengcheng Yang | Xu Sun | Wei Li | Shuming Ma | Wei Wu | Houfeng Wang
Proceedings of the 27th International Conference on Computational Linguistics

Multi-label classification is an important yet challenging task in natural language processing. It is more complex than single-label classification in that the labels tend to be correlated. Existing methods tend to ignore the correlations between labels. Besides, different parts of the text can contribute differently to predicting different labels, which is not considered by existing models. In this paper, we propose to view the multi-label classification task as a sequence generation problem, and apply a sequence generation model with a novel decoder structure to solve it. Extensive experimental results show that our proposed methods outperform previous work by a substantial margin. Further analysis demonstrates that the proposed methods not only capture the correlations between labels, but also automatically select the most informative words when predicting different labels.
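The sequence-generation view of multi-label classification can be sketched as greedy decoding over the label vocabulary, masking labels that have already been emitted so each label is produced at most once. This is a toy, assumption-laden sketch (GRU cell, hypothetical stop label), not the SGM implementation.

```python
import torch
import torch.nn as nn

def decode_labels(decoder_cell, out_proj, label_emb, ctx, max_labels, stop_id):
    """Greedy sketch of multi-label prediction as sequence generation:
    emit one label at a time and mask labels already produced."""
    h = ctx.clone()                      # (1, dim) decoder state from the encoder
    prev = torch.zeros_like(label_emb.weight[0]).unsqueeze(0)
    emitted, mask = [], torch.zeros(out_proj.out_features)
    for _ in range(max_labels):
        h = decoder_cell(prev, h)
        logits = out_proj(h).squeeze(0) + mask
        label = int(torch.argmax(logits))
        if label == stop_id:
            break
        emitted.append(label)
        mask[label] = float("-inf")      # each label is generated at most once
        prev = label_emb(torch.tensor([label]))
    return emitted

dim, n_labels = 32, 6
print(decode_labels(nn.GRUCell(dim, dim), nn.Linear(dim, n_labels),
                    nn.Embedding(n_labels, dim), torch.randn(1, dim),
                    max_labels=4, stop_id=0))
```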

pdf bib
Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach
Jingjing Xu | Xu Sun | Qi Zeng | Xiaodong Zhang | Xuancheng Ren | Houfeng Wang | Wenjie Li
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The goal of sentiment-to-sentiment “translation” is to change the underlying sentiment of a sentence while keeping its content. The main challenge is the lack of parallel data. To solve this problem, we propose a cycled reinforcement learning method that enables training on unpaired data through collaboration between a neutralization module and an emotionalization module. We evaluate our approach on two review datasets, Yelp and Amazon. Experimental results show that our approach significantly outperforms the state-of-the-art systems. In particular, the proposed method substantially improves content preservation: the BLEU score is improved from 1.64 to 22.46 and from 0.56 to 14.06 on the two datasets, respectively.

pdf bib
Question Condensing Networks for Answer Selection in Community Question Answering
Wei Wu | Xu Sun | Houfeng Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Answer selection is an important subtask of community question answering (CQA). In a real-world CQA forum, a question is often represented in two parts: a subject that summarizes the main points of the question, and a body that elaborates on the subject in detail. Previous work on answer selection usually ignored the difference between these two parts and concatenated them as the question representation. In this paper, we propose Question Condensing Networks (QCN) to make use of the subject-body relationship of community questions. In our model, the question subject is the primary part of the question representation, and the question body information is aggregated based on its similarity and disparity with the question subject. Experimental results show that QCN outperforms all existing models on two CQA datasets.
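One simple way to picture the similarity/disparity aggregation is to split each body word vector into a component parallel to the subject representation (similar) and an orthogonal component (disparate), then pool both alongside the subject; the sketch below is a simplification for illustration, not the released QCN model.

```python
import torch
import torch.nn.functional as F

def condense(subject, body):
    """Sketch of composing a question from subject and body: decompose each
    body word into similarity and disparity components, then mean-pool."""
    s = F.normalize(subject.mean(dim=0, keepdim=True), dim=-1)   # (1, d) subject summary
    parallel = (body @ s.t()) * s                                 # (m, d) similar part
    orthogonal = body - parallel                                  # (m, d) disparate part
    return torch.cat([subject.mean(dim=0),
                      parallel.mean(dim=0),
                      orthogonal.mean(dim=0)], dim=-1)

subject, body = torch.randn(5, 16), torch.randn(20, 16)
print(condense(subject, body).shape)  # torch.Size([48])
```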

pdf bib
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
Shuming Ma | Xu Sun | Junyang Lin | Houfeng Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Most of the current abstractive text summarization models are based on the sequence-to-sequence model (Seq2Seq). The source content of social media is long and noisy, so it is difficult for Seq2Seq to learn an accurate semantic representation. Compared with the source content, the annotated summary is short and well written; moreover, it shares the same meaning as the source content. In this work, we supervise the learning of the representation of the source content with that of the summary. In implementation, we regard a summary autoencoder as an assistant supervisor of Seq2Seq. Following previous work, we evaluate our model on a popular Chinese social media dataset. Experimental results show that our model achieves state-of-the-art performance on the benchmark dataset.
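The “assistant supervisor” idea amounts to adding a term to the training objective that pulls the Seq2Seq source representation toward the summary autoencoder's representation; a hedged sketch of such a combined loss (the weight and the distance choice are assumptions) follows.

```python
import torch
import torch.nn.functional as F

def assisted_loss(seq2seq_nll, src_repr, summary_repr, weight=0.5):
    """Sketch of the combined objective: the usual generation loss plus a term
    pulling the source representation toward the summary autoencoder's one."""
    consistency = F.mse_loss(src_repr, summary_repr.detach())
    return seq2seq_nll + weight * consistency

loss = assisted_loss(torch.tensor(2.3), torch.randn(4, 64), torch.randn(4, 64))
print(float(loss))
```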

2017

pdf bib
A Two-Stage Parsing Method for Text-Level Discourse Analysis
Yizhong Wang | Sujian Li | Houfeng Wang
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Previous work introduced transition-based algorithms to form a unified architecture for parsing rhetorical structures (including span, nuclearity and relation), but did not achieve satisfactory performance. In this paper, we propose that the transition-based model is more appropriate for parsing the naked discourse tree (i.e., identifying span and nuclearity) due to data sparsity. At the same time, we argue that relation labeling can benefit from the naked tree structure and should be treated elaborately, with consideration of three kinds of relations: within-sentence, across-sentence and across-paragraph. Thus, we design a pipelined two-stage parsing method for generating an RST tree from text. Experimental results show that our method achieves state-of-the-art performance, especially on span and nuclearity identification.

pdf bib
Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization
Shuming Ma | Xu Sun | Jingjing Xu | Houfeng Wang | Wenjie Li | Qi Su
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Current Chinese social media text summarization models are based on an encoder-decoder framework. Although the generated summaries are literally similar to the source texts, they have low semantic relevance. In this work, our goal is to improve the semantic relevance between source texts and summaries for Chinese social media summarization. We introduce a Semantic Relevance Based neural model to encourage high semantic similarity between texts and summaries. In our model, the source text is represented by a gated attention encoder, while the summary representation is produced by a decoder, and the similarity score between the two representations is maximized during training. Our experiments show that the proposed model outperforms baseline systems on a social media corpus.

pdf bib
Addressing Domain Adaptation for Chinese Word Segmentation with Global Recurrent Structure
Shen Huang | Xu Sun | Houfeng Wang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Boundary features are widely used in traditional Chinese Word Segmentation (CWS) methods, as they can utilize unlabeled data to help improve Out-of-Vocabulary (OOV) word recognition. Although various neural network methods for CWS have achieved performance competitive with state-of-the-art systems, these methods, constrained by the domain and size of the training corpus, do not work well in domain adaptation. In this paper, we propose a novel BLSTM-based neural network model that incorporates a global recurrent structure designed to model boundary features dynamically. Experiments show that the proposed structure can effectively boost the performance of Chinese word segmentation, especially OOV recall, which benefits domain adaptation. We achieve state-of-the-art results on six domains of CNKI articles, and results competitive with the best reported on the four domains of the SIGHAN Bakeoff 2010 data.

pdf bib
Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification
Yizhong Wang | Sujian Li | Jingfeng Yang | Xu Sun | Houfeng Wang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Identifying implicit discourse relations between text spans is a challenging task because it requires understanding the meaning of the text. To tackle this task, recent studies have tried several deep learning methods, but few of them exploited syntactic information. In this work, we explore the idea of incorporating syntactic parse trees into neural networks. Specifically, we employ the Tree-LSTM and Tree-GRU models, which are based on the tree structure, to encode the arguments of a relation, and we further leverage the constituent tags to control the semantic composition process in these tree-structured neural networks. Experimental results show that our method achieves state-of-the-art performance on the PDTB corpus.

pdf bib
Cascading Multiway Attentions for Document-level Sentiment Classification
Dehong Ma | Sujian Li | Xiaodong Zhang | Houfeng Wang | Xu Sun
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Document-level sentiment classification aims to assign a sentiment polarity to user reviews. Previous methods either utilized only the document content, without consideration of user and product information, or did not comprehensively consider the roles that the three kinds of information play in text modeling. In this paper, to make reasonable use of all the information, we present the idea that the user, the product and their combination can all influence the generation of attention over words and sentences when judging the sentiment of a document. With this idea, we propose a cascading multiway attention (CMA) model, where multiple ways of using user and product information are cascaded to influence the generation of attention at the word and sentence layers. Sentences and documents are then modeled by multiple representation vectors, which provide rich information for sentiment classification. Experiments on IMDB and Yelp datasets demonstrate the effectiveness of our model.
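Each attention “way” can be pictured as scoring words against a conditioning vector (the user, the product, or their combination) and pooling; the sketch below is an illustrative simplification of one such step, not the CMA model.

```python
import torch
import torch.nn.functional as F

def conditioned_attention(words, condition):
    """Score each word against a conditioning vector (user, product, or their
    combination) and pool into one conditioned text vector (minimal sketch)."""
    scores = words @ condition                 # (n,) relevance to the condition
    weights = F.softmax(scores, dim=0)
    return weights @ words                     # (d,) conditioned representation

words = torch.randn(12, 32)                    # word states of one sentence
user, product = torch.randn(32), torch.randn(32)
for cond in (user, product, user + product):   # the three cascaded "ways"
    print(conditioned_attention(words, cond).shape)
```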

pdf bib
Learning to Rank Semantic Coherence for Topic Segmentation
Liang Wang | Sujian Li | Yajuan Lv | Houfeng Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Topic segmentation plays an important role in discourse parsing and information retrieval. Due to the absence of training data, previous work mainly adopts unsupervised methods that rank semantic coherence between paragraphs for topic segmentation. In this paper, we present an intuitive and simple idea for automatically creating a “quasi” training dataset, which includes a large number of text pairs drawn from the same or different documents with different degrees of semantic coherence. With this training corpus, we design a symmetric CNN to model text pairs and rank their semantic coherence within the learning-to-rank framework. Experiments show that our algorithm achieves competitive performance over strong baselines on several real-world datasets.
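The “quasi” training set can be sketched very simply: adjacent paragraphs of the same document form a more coherent pair, while a paragraph paired with one from another document forms a less coherent pair, yielding ranked pairs for a learning-to-rank objective. The code below is an illustrative assumption of that construction, not the paper's pipeline.

```python
import random

def make_quasi_pairs(documents, n_pairs=1000, seed=0):
    """Build ranked training pairs: (coherent pair, less coherent pair).
    documents: list of documents, each a list of paragraph strings."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        doc = rng.choice(documents)
        if len(doc) < 2 or len(documents) < 2:
            continue
        i = rng.randrange(len(doc) - 1)
        coherent = (doc[i], doc[i + 1])                  # same document, adjacent
        other = rng.choice([d for d in documents if d is not doc])
        incoherent = (doc[i], rng.choice(other))         # across documents
        pairs.append((coherent, incoherent))
    return pairs

docs = [["p%d-%d" % (d, p) for p in range(5)] for d in range(3)]
print(make_quasi_pairs(docs, n_pairs=2))
```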

pdf bib
Noise-Clustered Distant Supervision for Relation Extraction: A Nonparametric Bayesian Perspective
Qing Zhang | Houfeng Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

For the task of relation extraction, distant supervision is an efficient approach to generating labeled data by aligning a knowledge base with free text. At its core, it is a challenging incomplete multi-label classification problem with sparse and noisy features. To address this challenge, this work presents a novel nonparametric Bayesian formulation for the task. Experimental results show substantial improvements in top precision over traditional state-of-the-art approaches.

2016

pdf bib
Bidirectional Recurrent Convolutional Neural Network for Relation Classification
Rui Cai | Xiaodong Zhang | Houfeng Wang
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Knowledge-Based Semantic Embedding for Machine Translation
Chen Shi | Shujie Liu | Shuo Ren | Shi Feng | Mu Li | Ming Zhou | Xu Sun | Houfeng Wang
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation (SedMT 2016)
Deyi Xiong | Kevin Duh | Eneko Agirre | Nora Aranberri | Houfeng Wang
Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation (SedMT 2016)

pdf bib
Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis
Shen Huang | Houfeng Wang
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

Grammatical error diagnosis for Chinese has always been a challenge for both foreign learners and NLP researchers, owing to the variety of its grammar and the flexibility of its expression. In this paper, we present a model based on Bidirectional Long Short-Term Memory (Bi-LSTM) neural networks that treats the task as a sequence labeling problem, so as to detect Chinese grammatical errors, identify the error types and locate the error positions. In the corpora of this year’s shared task, a single offset in a sentence can contain multiple errors. To address this, we simultaneously train three Bi-LSTM models that share word embeddings and label Missing, Redundant and Selection errors respectively. We regard word ordering errors as a special, longer kind of word selection error during the training phase, and then separate the two by length during the testing phase. In the NLP-TEA 3 shared task on Chinese Grammatical Error Diagnosis (CGED), our system achieved relatively high F1 scores at all three levels in the Traditional Chinese track and at the detection level in the Simplified Chinese track.
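The three-tagger setup can be sketched as three Bi-LSTM sequence labelers over one shared embedding table, one each for Missing, Redundant and Selection errors; the sketch below is illustrative (hyperparameters and tag inventory are assumptions), not the shared-task system.

```python
import torch
import torch.nn as nn

class SharedEmbeddingTaggers(nn.Module):
    """Sketch of three Bi-LSTM sequence taggers (Missing / Redundant / Selection)
    that share one word-embedding table."""

    def __init__(self, vocab, dim, n_tags=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.taggers = nn.ModuleList([
            nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
            for _ in range(3)])
        self.out = nn.ModuleList([nn.Linear(2 * dim, n_tags) for _ in range(3)])

    def forward(self, tokens):
        x = self.emb(tokens)                       # shared embeddings
        return [proj(lstm(x)[0]) for lstm, proj in zip(self.taggers, self.out)]

scores = SharedEmbeddingTaggers(vocab=100, dim=16)(torch.randint(0, 100, (2, 7)))
print([s.shape for s in scores])   # three per-token score tensors of shape (2, 7, 2)
```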

2015

pdf bib
Multi-label Text Categorization with Joint Learning Predictions-as-Features Method
Li Li | Houfeng Wang | Xu Sun | Baobao Chang | Shi Zhao | Lei Sha
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Dependency-Based Neural Network for Relation Classification
Yang Liu | Furu Wei | Sujian Li | Heng Ji | Ming Zhou | Houfeng Wang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf bib
Learning Summary Prior Representation for Extractive Summarization
Ziqiang Cao | Furu Wei | Sujian Li | Wenjie Li | Ming Zhou | Houfeng Wang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf bib
Feature-Frequency–Adaptive On-line Training for Fast and Accurate Natural Language Processing
Xu Sun | Wenjie Li | Houfeng Wang | Qin Lu
Computational Linguistics, Volume 40, Issue 3 - September 2014

pdf bib
A Unified Framework for Grammar Error Correction
Longkai Zhang | Houfeng Wang
Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task

pdf bib
Collaborative Topic Regression with Multiple Graphs Factorization for Recommendation in Social Media
Qing Zhang | Houfeng Wang
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Multi-view Chinese Treebanking
Likun Qiu | Yue Zhang | Peng Jin | Houfeng Wang
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Go Climb a Dependency Tree and Correct the Grammatical Errors
Longkai Zhang | Houfeng Wang
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Predicting Chinese Abbreviations with Minimum Semantic Unit and Global Constraints
Longkai Zhang | Li Li | Houfeng Wang | Xu Sun
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Multi-label Text Categorization with Hidden Components
Li Li | Longkai Zhang | Houfeng Wang
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Coarse-grained Candidate Generation and Fine-grained Re-ranking for Chinese Abbreviation Prediction
Longkai Zhang | Houfeng Wang | Xu Sun
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf bib
Exploring Representations from Unlabeled Data with Co-training for Chinese Word Segmentation
Longkai Zhang | Houfeng Wang | Xu Sun | Mairgup Mansur
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Efficient Collective Entity Linking with Stacking
Zhengyan He | Shujie Liu | Yang Song | Mu Li | Ming Zhou | Houfeng Wang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Generalized Abbreviation Prediction with Negative Full Forms and Its Application on Improving Chinese Web Search
Xu Sun | Wenjie Li | Fanqi Meng | Houfeng Wang
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf bib
Learning Entity Representation for Entity Disambiguation
Zhengyan He | Shujie Liu | Mu Li | Ming Zhou | Longkai Zhang | Houfeng Wang
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Improving Chinese Word Segmentation on Micro-blog Using Rich Punctuations
Longkai Zhang | Li Li | Zhengyan He | Houfeng Wang | Ni Sun
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

pdf bib
Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection
Xu Sun | Houfeng Wang | Wenjie Li
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Cross-Lingual Mixture Model for Sentiment Classification
Xinfan Meng | Furu Wei | Xiaohua Liu | Ming Zhou | Ge Xu | Houfeng Wang
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
The Task 2 of CIPS-SIGHAN 2012 Named Entity Recognition and Disambiguation in Chinese Bakeoff
Zhengyan He | Houfeng Wang | Sujian Li
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
A Comparison and Improvement of Online Learning Algorithms for Sequence Labeling
Zhengyan He | Houfeng Wang
Proceedings of COLING 2012

pdf bib
Constructing Chinese Abbreviation Dictionary: A Stacked Approach
Longkai Zhang | Sujian Li | Houfeng Wang | Ni Sun | Xinfan Meng
Proceedings of COLING 2012

pdf bib
Lost in Translations? Building Sentiment Lexicons using Context Based Machine Translation
Xinfan Meng | Furu Wei | Ge Xu | Longkai Zhang | Xiaohua Liu | Ming Zhou | Houfeng Wang
Proceedings of COLING 2012: Posters

pdf bib
Joint Learning for Coreference Resolution with Markov Logic
Yang Song | Jing Jiang | Wayne Xin Zhao | Sujian Li | Houfeng Wang
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

pdf bib
Link Type Based Pre-Cluster Pair Model for Coreference Resolution
Yang Song | Houfeng Wang | Jing Jiang
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

2010

pdf bib
A Pipeline Approach to Chinese Personal Name Disambiguation
Yang Song | Zhengyan He | Chen Chen | Houfeng Wang
CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
Applying Spectral Clustering for Chinese Word Sense Induction
Zhengyan He | Yang Song | Houfeng Wang
CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf bib
Build Chinese Emotion Lexicons Using A Graph-based Algorithm and Multiple Resources
Ge Xu | Xinfan Meng | Houfeng Wang
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

pdf bib
Mining User Reviews: from Specification to Summarization
Xinfan Meng | Houfeng Wang
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

pdf bib
Clustering Technique in Multi-Document Personal Name Disambiguation
Chen Chen | Junfeng Hu | Houfeng Wang
Proceedings of the ACL-IJCNLP 2009 Student Research Workshop

2008

pdf bib
Bootstrapping Both Product Features and Opinion Words from Chinese Customer Reviews with Cross-Inducing
Bo Wang | Houfeng Wang
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib
Chinese Named Entity Recognition and Word Segmentation Based on Character
Jingzhou He | Houfeng Wang
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

2003

pdf bib
News-Oriented Automatic Chinese Keyword Indexing
Sujian Li | Houfeng Wang | Shiwen Yu | Chengsheng Xin
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing

pdf bib
News-Oriented Keyword Indexing with Maximum Entropy Principle
Sujian Li | Houfeng Wang | Shiwen Yu | Chengsheng Xin
Proceedings of the 17th Pacific Asia Conference on Language, Information and Computation