Jasabanta Patro


Code-Switching Patterns Can Be an Effective Route to Improve Performance of Downstream NLP Applications: A Case Study of Humour, Sarcasm and Hate Speech Detection
Srijan Bansal | Vishal Garimella | Ayush Suhane | Jasabanta Patro | Animesh Mukherjee
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In this paper, we demonstrate how code-switching patterns can be utilised to improve various downstream NLP applications. In particular, we encode various switching features to improve humour, sarcasm and hate speech detection tasks. We believe that this simple linguistic observation can also be potentially helpful in improving other similar NLP applications.


A deep-learning framework to detect sarcasm targets
Jasabanta Patro | Srijan Bansal | Animesh Mukherjee
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In this paper, we propose a deep learning framework for sarcasm target detection in predefined sarcastic texts. Identifying sarcasm targets can help in many core natural language processing tasks, such as aspect-based sentiment analysis and opinion mining. To begin with, we perform an empirical study of socio-linguistic features and identify those that are statistically significant in indicating sarcasm targets (p-values between 0.05 and 0.001). Finally, we present a deep-learning framework augmented with socio-linguistic features to detect sarcasm targets in sarcastic book snippets and tweets. We achieve a substantial improvement in exact match and Dice scores over the current state-of-the-art baseline.

KGPChamps at SemEval-2019 Task 3: A deep learning approach to detect emotions in the dialog utterances.
Jasabanta Patro | Nitin Choudhary | Kalpit Chittora | Animesh Mukherjee
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes our approach to SemEval-2019 Task 3, EmoContext: given a textual dialogue, i.e. a user utterance along with two turns of context, we have to classify the emotion associated with the utterance as one of the following classes: Happy, Sad, Angry or Others. To solve this problem, we experiment with different deep learning models, ranging from a simple bidirectional LSTM (long short-term memory) model to a comparatively complex attention model. We also experiment with ConceptNet word embeddings along with word embeddings generated by a bidirectional LSTM that takes characters as input. We fine-tune the parameters and hyper-parameters of each model and report the evaluation measure, i.e. micro precision, along with the class-wise precision, recall and F1-score of each system. The best-performing model is the bidirectional LSTM whose input word embedding is the concatenation of the character-level bidirectional LSTM embedding and the ConceptNet embedding; it achieves the highest micro-F1 score of 0.7261. We also report class-wise precision, recall and F1-score for the best-performing model and for the other models we experimented with.


All that is English may be Hindi: Enhancing language identification through automatic ranking of the likeliness of word borrowing in social media
Jasabanta Patro | Bidisha Samanta | Saurabh Singh | Abhipsa Basu | Prithwish Mukherjee | Monojit Choudhury | Animesh Mukherjee
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

In this paper, we present a set of computational methods to identify the likeliness of a word being borrowed, based on signals from social media. In terms of Spearman’s correlation values, our methods perform more than two times better (∼ 0.62) in predicting the borrowing likeliness than the best-performing baseline (∼ 0.26) reported in the literature. Based on this likeliness estimate, we asked annotators to re-annotate the language tags of foreign words in predominantly native contexts. In 88% of cases the annotators felt that the foreign language tag should be replaced by the native language tag, indicating considerable scope for improving automatic language identification systems.