Arjun Mukherjee


2020

Stance Prediction for Contemporary Issues: Data and Experiments
Marjan Hosseinia | Eduard Dragut | Arjun Mukherjee
Proceedings of the Eighth International Workshop on Natural Language Processing for Social Media

We investigate whether pre-trained bidirectional transformers with sentiment and emotion information improve stance detection in long discussions of contemporary issues. As part of this work, we create a novel stance detection dataset covering 419 different controversial issues and their related pros and cons, collected by procon.org in a nonpartisan format. Experimental results show that a shallow recurrent neural network with sentiment or emotion information, using 20x fewer parameters, can reach results competitive with fine-tuned BERT. We also use a simple approach that explains which input phrases contribute to stance detection.

Predicting Personal Opinion on Future Events with Fingerprints
Fan Yang | Eduard Dragut | Arjun Mukherjee
Proceedings of the 28th International Conference on Computational Linguistics

Predicting users’ opinions in their response to social events has important real-world applications, many of which have political and social impacts. Existing approaches derive a population’s opinion on an ongoing event from large volumes of user-generated content. In certain scenarios, we may not be able to acquire such content and thus cannot infer an unbiased opinion on those emerging events. To address this problem, we propose to explore opinion on unseen articles based on one’s fingerprinting: the prior reading and commenting history. This work presents a focused study on modeling and leveraging fingerprinting techniques to predict a user’s future opinion. We introduce a recurrent neural network based model that integrates fingerprinting. We collect a large dataset that consists of event-comment pairs from six news websites. We evaluate the proposed model on this dataset. The results show substantial performance gains, demonstrating the effectiveness of our approach.

2018

Experiments with Convolutional Neural Networks for Multi-Label Authorship Attribution
Dainis Boumber | Yifan Zhang | Arjun Mukherjee
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Attending Sentences to detect Satirical Fake News
Sohan De Sarkar | Fan Yang | Arjun Mukherjee
Proceedings of the 27th International Conference on Computational Linguistics

Satirical news detection is important in order to prevent the spread of misinformation over the Internet. Existing approaches to capture news satire use machine learning models such as SVM and hierarchical neural networks along with hand-engineered features, but do not explore the difference between the sentence level and the document level. This paper proposes a robust, hierarchical deep neural network approach for satire detection, which is capable of capturing satire both at the sentence level and at the document level. The architecture incorporates pluggable generic neural networks like CNN, GRU, and LSTM. Experimental results on a real-world news satire dataset show substantial performance gains, demonstrating the effectiveness of our proposed approach. An inspection of the learned models reveals the existence of key sentences that control the presence of satire in news.

2017

Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features
Fan Yang | Arjun Mukherjee | Eduard Dragut
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Satirical news is considered to be entertainment, but it is potentially deceptive and harmful. Despite the embedded genre in the article, not everyone can recognize the satirical cues, and some readers therefore believe the news to be true. We observe that satirical cues are often reflected in certain paragraphs rather than the whole document. Existing work considers only document-level features to detect satire, which can be limiting. We consider paragraph-level linguistic features to unveil the satire by incorporating a neural network and an attention mechanism. We investigate the difference between paragraph-level features and document-level features, and analyze them on a large satirical news dataset. The evaluation shows that the proposed model detects satirical news effectively and reveals what features are important at which level.

2016

Extracting Aspect Specific Opinion Expressions
Abhishek Laddha | Arjun Mukherjee
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Leveraging Multiple Domains for Sentiment Classification
Fan Yang | Arjun Mukherjee | Yifan Zhang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Sentiment classification is becoming more and more important with the rapid growth of user-generated content. However, the sentiment classification task usually comes with two challenges: first, sentiment classification is highly domain-dependent, and training a sentiment classifier for every domain is inefficient and often impractical; second, since the quantity of labeled data is important for assessing the quality of a classifier, it is hard to evaluate classifiers when labeled data is limited for certain domains. To address these challenges, we focus on learning high-level features that are able to generalize across domains, so a global classifier can benefit from a simple combination of documents from multiple domains. The proposed model incorporates both sentiment polarity and unlabeled data from multiple domains and learns new feature representations. Our model doesn’t require labels from every domain, which means the learned feature representation can be generalized for sentiment domain adaptation. In addition, the learned feature representation can be used as a classifier, since our model defines the meaning of each feature value and arranges high-level features in a pre-fixed order, so it is not necessary to train another classifier on top of the new features. Empirical evaluations demonstrate that our model outperforms baselines and yields results competitive with other state-of-the-art work on benchmark datasets.

Analysis of Anxious Word Usage on Online Health Forums
Nicolas Rey-Villamizar | Prasha Shrestha | Farig Sadeque | Steven Bethard | Ted Pedersen | Arjun Mukherjee | Thamar Solorio
Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis

2015

Detecting Deceptive Opinion Spam using Linguistics, Behavioral and Statistical Modeling
Arjun Mukherjee
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing: Tutorial Abstracts

2014

Aspect Extraction with Automated Prior Knowledge Learning
Zhiyuan Chen | Arjun Mukherjee | Bing Liu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Predicting Interesting Things in Text
Michael Gamon | Arjun Mukherjee | Patrick Pantel
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Exploiting Social Relations and Sentiment for Stock Prediction
Jianfeng Si | Arjun Mukherjee | Bing Liu | Sinno Jialin Pan | Qing Li | Huayi Li
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Exploiting Domain Knowledge in Aspect Extraction
Zhiyuan Chen | Arjun Mukherjee | Bing Liu | Meichun Hsu | Malu Castellanos | Riddhiman Ghosh
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Discovering User Interactions in Ideological Discussions
Arjun Mukherjee | Bing Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Public Dialogue: Analysis of Tolerance in Online Discussions
Arjun Mukherjee | Vivek Venkataraman | Bing Liu | Sharon Meraz
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Exploiting Topic based Twitter Sentiment for Stock Prediction
Jianfeng Si | Arjun Mukherjee | Bing Liu | Qing Li | Huayi Li | Xiaotie Deng
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Modeling Review Comments
Arjun Mukherjee | Bing Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Aspect Extraction through Semi-Supervised Modeling
Arjun Mukherjee | Bing Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Analysis of Linguistic Style Accommodation in Online Debates
Arjun Mukherjee | Bing Liu
Proceedings of COLING 2012

2010

Improving Gender Classification of Blog Authors
Arjun Mukherjee | Bing Liu
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing