Dipankar Das


2020

JUNLP at SemEval-2020 Task 9: Sentiment Analysis of Hindi-English Code Mixed Data Using Grid Search Cross Validation
Avishek Garain | Sainik Mahata | Dipankar Das
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Code-mixing is a phenomenon that arises mainly in multilingual societies. Multilingual people who are well versed in both their native language and English tend to code-mix, using English-based phonetic typing and inserting anglicisms into their main language. This linguistic phenomenon poses a great challenge to conventional NLP tasks such as Sentiment Analysis, Machine Translation, and Text Summarization, to name a few. In this work, we focus on working out a plausible solution for Code-Mixed Sentiment Analysis. The work was carried out as part of our participation in the SemEval-2020 Sentimix Task, where we focused on the sentiment analysis of English-Hindi code-mixed sentences. Our username for the submission was “sainik.mahata” and our team name was “JUNLP”. We used feature extraction algorithms in conjunction with traditional machine learning algorithms such as SVR, together with Grid Search, in an attempt to solve the task. Our approach garnered an F1-score of 66.2% when evaluated using the metrics prepared by the organizers of the task.
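
The abstract above mentions Grid Search, and the title mentions cross validation. As a rough illustration of the general idea only, the following minimal sketch scores every hyperparameter combination by k-fold cross-validated accuracy and keeps the best one. It is not the authors' system: the toy 1-D nearest-neighbour classifier, the fold assignment, and all function names are assumptions made for this example, standing in for the SVR model and extracted features described in the paper.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists; sample i is assigned to fold i % n_folds."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def knn_predict(train_x, train_y, x, k):
    """Toy 1-D k-nearest-neighbour classifier (a hypothetical stand-in model)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = [train_y[i] for i in nearest]
    # Majority vote; ties are broken by alphabetical label order.
    return max(sorted(set(votes)), key=votes.count)


def grid_search_cv(xs, ys, param_grid, n_folds=3):
    """Return (best_params, best_score) by mean cross-validated accuracy."""
    names = sorted(param_grid)
    best_params, best_score = None, -1.0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for train, test in k_fold_indices(len(xs), n_folds):
            train_x = [xs[i] for i in train]
            train_y = [ys[i] for i in train]
            hits = sum(
                knn_predict(train_x, train_y, xs[i], **params) == ys[i]
                for i in test
            )
            fold_scores.append(hits / len(test))
        score = mean(fold_scores)
        if score > best_score:  # keep the first combination reaching the best score
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical data: one numeric feature per sentence and a sentiment label.
features = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
labels = ["neg", "neg", "neg", "pos", "pos", "pos"]
best, score = grid_search_cv(features, labels, {"k": [1, 3]})
print(best, score)
```

In practice the same loop would wrap whatever model and metric a task requires; the grid here has a single hypothetical parameter `k` only to keep the sketch short.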

2019

JUMT at WMT2019 News Translation Task: A Hybrid Approach to Machine Translation for Lithuanian to English
Sainik Kumar Mahata | Avishek Garain | Adityar Rayala | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

In the current work, we present a description of the system submitted to the WMT 2019 News Translation Shared Task. The system was created to translate news text from Lithuanian to English. To accomplish the given task, our system used a Word Embedding based Neural Machine Translation model to post-edit the outputs generated by a Statistical Machine Translation model. The current paper documents the architecture of our model, descriptions of its various modules, and the results it produced. Our system garnered a BLEU score of 17.6.

NLP at SemEval-2019 Task 6: Detecting Offensive language using Neural Networks
Prashant Kapil | Asif Ekbal | Dipankar Das
Proceedings of the 13th International Workshop on Semantic Evaluation

In this paper, we describe several deep learning architectures built for the shared task OffensEval: Identifying and Categorizing Offensive Language in Social Media at SemEval-2019. The dataset was annotated with a three-level annotation scheme, and the subtasks were to distinguish offensive from non-offensive content, to categorize offensive content, and to identify the target of offensive content. Deep learning models with POS information as a feature were also leveraged for classification. The best-performing models on the individual subtasks were a stacked CNN-Bi-LSTM with attention, a Bi-LSTM with POS information added to word features, and a Bi-LSTM for the third subtask. Our models achieved Macro F1 scores of 0.7594, 0.5378, and 0.4588 on Subtasks A, B, and C, respectively, ranking 33rd, 54th, and 52nd out of 103, 75, and 65 submissions. All three best-performing models use neural networks.

2018

JUCBNMT at WMT2018 News Translation Task: Character Based Neural Machine Translation of Finnish to English
Sainik Kumar Mahata | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

In the current work, we present a description of the system submitted to the WMT 2018 News Translation Shared Task. The system was created to translate news text from Finnish to English. The system used a Character Based Neural Machine Translation model to accomplish the given task. The current paper documents the preprocessing steps, the description of the submitted system, and the results it produced. Our system garnered a BLEU score of 12.9.

2017

JUNLP at IJCNLP-2017 Task 3: A Rank Prediction Model for Review Opinion Diversification
Monalisa Dey | Anupam Mondal | Dipankar Das
Proceedings of the IJCNLP 2017, Shared Tasks

The IJCNLP-17 Review Opinion Diversification (RevOpiD-2017) task was designed for ranking the top-k reviews of a product from a set of reviews, which helps in identifying a summarized output that expresses the opinion of the entire review set. The task is divided into three independent subtasks: subtask-A, subtask-B, and subtask-C. These subtasks select the top-k reviews based on the helpfulness, representativeness, and exhaustiveness, respectively, of the opinions expressed in the review set. To develop the modules and predict the rank of reviews for all three subtasks, we employed two well-known supervised classifiers, namely Naïve Bayes and Logistic Regression, on top of several features extracted from the provided datasets, such as the number of nouns, the number of verbs, and the number of sentiment words. Finally, the organizers helped validate the predicted outputs for all three subtasks using their evaluation metrics. For a list size of 5, the metrics give scores of 0.80 (mth) for subtask-A; 0.86 (cos), 0.87 (cos d), 0.71 (cpr), 4.98 (a-dcg), and 556.94 (wt) for subtask-B; and 10.94 (unwt) and 0.67 (recall) for subtask-C.

NITMZ-JU at IJCNLP-2017 Task 4: Customer Feedback Analysis
Somnath Banerjee | Partha Pakray | Riyanka Manna | Dipankar Das | Alexander Gelbukh
Proceedings of the IJCNLP 2017, Shared Tasks

In this paper, we describe a deep learning framework for analyzing the customer feedback as part of our participation in the shared task on Customer Feedback Analysis at the 8th International Joint Conference on Natural Language Processing (IJCNLP 2017). A Convolutional Neural Network (CNN) based deep neural network model was employed for the customer feedback task. The proposed system was evaluated on two languages, namely, English and French.

JU NITM at IJCNLP-2017 Task 5: A Classification Approach for Answer Selection in Multi-choice Question Answering System
Sandip Sarkar | Dipankar Das | Partha Pakray
Proceedings of the IJCNLP 2017, Shared Tasks

This paper describes the participation of the JU NITM team in IJCNLP-2017 Task 5: “Multi-choice Question Answering in Examinations”. The main aim of this shared task is to choose the correct option for each multi-choice question. Our proposed model uses vector representations as features and machine learning for classification. First, we represent the question and the answer in vector space and then compute the cosine similarity between the two vectors. Finally, we apply a classification approach to find the correct answer. Our system was developed only for the English language, and it obtained an accuracy of 40.07% on the test dataset and 40.06% on the validation dataset.
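
To make the similarity step concrete, here is a minimal sketch that represents a question and each answer option as bag-of-words count vectors and ranks the options by cosine similarity. The bag-of-words representation, the function names, and the example question are assumptions made for illustration; the paper's actual vector representation is not specified here, and its final classification step is replaced by simply ranking the options.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a text to a sparse vector of lowercase token counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_options(question, options):
    """Return the options sorted from most to least similar to the question."""
    question_vector = bag_of_words(question)
    return sorted(
        options,
        key=lambda option: cosine_similarity(question_vector, bag_of_words(option)),
        reverse=True,
    )


# Hypothetical example question and answer options.
ranked = rank_options(
    "which planet is known as the red planet",
    ["a banana is yellow", "mars is the red planet"],
)
print(ranked[0])
```

In a setup like the one the abstract describes, the similarity score would be one input feature to a trained classifier rather than the final decision.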

JU CSE NLP @ SemEval 2017 Task 7: Employing Rules to Detect and Interpret English Puns
Aniket Pramanick | Dipankar Das
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

System description. Implementation of HMM and Cyclic Dependency Network.

Identification of Character Adjectives from Mahabharata
Apurba Paul | Dipankar Das
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

The present paper describes the identification of prominent characters and their adjectives from the Indian mythological epic Mahabharata, using English-language texts. In contrast to the traditional approaches to named entity identification, the present system extracts hidden attributes associated with each of the characters (e.g., character adjectives). We observed distinct phrase-level linguistic patterns that hint at the presence of characters in different text spans. Six such patterns were used to extract the characters. In addition, a distinguishing set of novel features (e.g., multi-word expressions, nodes and paths of the parse tree, immediate ancestors, etc.) was employed. Further, the correlation of the features was measured in order to identify the important features. Finally, we applied various machine learning algorithms (e.g., Naive Bayes, KNN, Logistic Regression, Decision Tree, Random Forest, etc.) along with deep learning to classify the patterns as characters or non-characters in order to achieve decent accuracy. Evaluation shows that the phrase-level linguistic patterns as well as the adopted features are highly effective in capturing characters and their adjectives.

BUCC2017: A Hybrid Approach for Identifying Parallel Sentences in Comparable Corpora
Sainik Mahata | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 10th Workshop on Building and Using Comparable Corpora

A Statistical Machine Translation (SMT) system is always trained using a large parallel corpus to produce effective translations. Not only is such a corpus scarce, but preparing it also involves a lot of manual labor and cost. A parallel corpus can be prepared from comparable corpora, where a pair of corpora in two different languages covers the same domain. In the present work, we try to build a parallel corpus for the French-English language pair from a given comparable corpus. The data and the problem set were provided as part of the shared task organized by BUCC 2017. We propose a system that first translates the sentences, relying heavily on Moses, and then groups the sentences based on sentence length similarity. Finally, one-to-one sentence selection is done based on the Cosine Similarity algorithm.
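
The last two steps described above (grouping by sentence length similarity, then one-to-one selection by cosine similarity) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: it assumes the source sentences have already been machine-translated into English, and the length-ratio threshold `max_ratio`, the similarity threshold `min_sim`, the greedy matching strategy, and all function names are hypothetical choices made for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between the token-count vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def length_compatible(a, b, max_ratio=1.5):
    """True if the token lengths differ by at most a factor of max_ratio (assumed value)."""
    len_a, len_b = len(a.split()), len(b.split())
    if min(len_a, len_b) == 0:
        return False
    return max(len_a, len_b) / min(len_a, len_b) <= max_ratio


def align(translated, targets, min_sim=0.5):
    """Greedy one-to-one pairing of length-compatible sentences by cosine similarity."""
    candidates = [
        (cosine(s, t), i, j)
        for i, s in enumerate(translated)
        for j, t in enumerate(targets)
        if length_compatible(s, t)
    ]
    # Highest similarity first; indices break ties deterministically.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    used_src, used_tgt, pairs = set(), set(), []
    for sim, i, j in candidates:
        if sim < min_sim:
            break
        if i in used_src or j in used_tgt:
            continue
        used_src.add(i)
        used_tgt.add(j)
        pairs.append((i, j))
    return pairs


# Hypothetical machine-translated sentences and English-side candidates.
translated = ["the cat sleeps on the mat", "stock markets fell sharply today"]
targets = ["markets fell sharply today", "the cat sleeps on a mat", "hello"]
print(align(translated, targets))
```

The length filter prunes most candidate pairs cheaply before any similarity is computed, which matters when both sides of a comparable corpus are large.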

Relationship Extraction based on Category of Medical Concepts from Lexical Contexts
Anupam Mondal | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

Retrieving Similar Lyrics for Music Recommendation System
Braja Gopal Patra | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

Developing Lexicon and Classifier for Personality Identification in Texts
Kumar Gourav Das | Dipankar Das
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

A Deep Dive into Identification of Characters from Mahabharata
Apurba Paul | Dipankar Das
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

2016

Multimodal Mood Classification - A Case Study of Differences in Hindi and Western Songs
Braja Gopal Patra | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Music information retrieval has emerged as a mainstream research area in the past two decades. Experiments on music mood classification have been performed mainly on Western music based on audio, lyrics, and a combination of both. Unfortunately, due to the scarcity of digitized resources, Indian music fares poorly in music mood retrieval research. In this paper, we identified the mood taxonomy and prepared multimodal mood-annotated datasets for Hindi and Western songs. We identified important audio and lyric features using a correlation-based feature selection technique. Finally, we developed mood classification systems using Support Vector Machines and Feed Forward Neural Networks based on the features collected from audio, lyrics, and a combination of both. The best-performing multimodal systems achieved F-measures of 75.1 and 83.5 for classifying the moods of the Hindi and Western songs, respectively, using Feed Forward Neural Networks. A comparative analysis indicates that the selected features work well for mood classification of the Western songs and produce better results than the mood classification systems for Hindi songs.

WMT2016: A Hybrid Approach to Bilingual Document Alignment
Sainik Mahata | Dipankar Das | Santanu Pal
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

Unraveling the English-Bengali Code-Mixing Phenomenon
Arunavha Chanda | Dipankar Das | Chandan Mazumdar
Proceedings of the Second Workshop on Computational Approaches to Code Switching

Part-of-speech Tagging of Code-Mixed Social Media Text
Souvick Ghosh | Satanu Ghosh | Dipankar Das
Proceedings of the Second Workshop on Computational Approaches to Code Switching

Columbia-Jadavpur submission for EMNLP 2016 Code-Switching Workshop Shared Task: System description
Arunavha Chanda | Dipankar Das | Chandan Mazumdar
Proceedings of the Second Workshop on Computational Approaches to Code Switching

JU_NLP at SemEval-2016 Task 6: Detecting Stance in Tweets using Support Vector Machines
Braja Gopal Patra | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

JUNITMZ at SemEval-2016 Task 1: Identifying Semantic Similarity Using Levenshtein Ratio
Sandip Sarkar | Dipankar Das | Partha Pakray | Alexander Gelbukh
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

JU_NLP at SemEval-2016 Task 11: Identifying Complex Words in a Sentence
Niloy Mukherjee | Braja Gopal Patra | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

JUNLP at SemEval-2016 Task 13: A Language Independent Approach for Hypernym Identification
Promita Maitra | Dipankar Das
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

Identification and Classification of Emotional Key Phrases from Psychological Texts
Apurba Paul | Dipankar Das
Proceedings of the ACL 2015 Workshop on Novel Computational Approaches to Keyphrase Extraction

Mood Classification of Hindi Songs based on Lyrics
Braja Gopal Patra | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 12th International Conference on Natural Language Processing

2014

JU_CSE: A Conditional Random Field (CRF) Based Approach to Aspect Based Sentiment Analysis
Braja Gopal Patra | Soumik Mandal | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

How Sentiment Analysis Can Help Machine Translation
Santanu Pal | Braja Gopal Patra | Dipankar Das | Sudip Kumar Naskar | Sivaji Bandyopadhyay | Josef van Genabith
Proceedings of the 11th International Conference on Natural Language Processing

2013

Construction of Emotional Lexicon Using Potts Model
Braja Gopal Patra | Hiroya Takamura | Dipankar Das | Manabu Okumura | Sivaji Bandyopadhyay
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Emotion Co-referencing - Emotional Expression, Holder, and Topic
Dipankar Das | Sivaji Bandyopadhyay
International Journal of Computational Linguistics & Chinese Language Processing, Volume 18, Number 1, March 2013

Automatic Music Mood Classification of Hindi Songs
Braja Gopal Patra | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 3rd Workshop on Sentiment Analysis where AI meets Psychology

2012

Morphological Analyzer for Kokborok
Khumbar Debbarma | Braja Gopal Patra | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 3rd Workshop on South and Southeast Asian Natural Language Processing

Classification of Interviews - A Case Study on Cancer Patients
Braja Gopal Patra | Amitava Kundu | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 2nd Workshop on Sentiment Analysis where AI meets Psychology

Part of Speech (POS) Tagger for Kokborok
Braja Gopal Patra | Khumbar Debbarma | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of COLING 2012: Posters

A Light Weight Stemmer in Kokborok
Braja Gopal Patra | Khumbar Debbarma | Swapan Debbarma | Dipankar Das | Amitava Das | Sivaji Bandyopadhyay
Proceedings of the 24th Conference on Computational Linguistics and Speech Processing (ROCLING 2012)

2011

Semantic Clustering: an Attempt to Identify Multiword Expressions in Bengali
Tanmoy Chakraborty | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World

Identifying Event-Sentiment Association using Lexical Equivalence and Co-reference Approaches
Anup Kolya | Dipankar Das | Asif Ekbal | Sivaji Bandyopadhyay
Proceedings of the ACL 2011 Workshop on Relational Models of Semantics

Developing Japanese WordNet Affect for Analyzing Emotions
Yoshimitsu Torii | Dipankar Das | Sivaji Bandyopadhyay | Manabu Okumura
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)

Analyzing Emotional Statements – Roles of General and Physiological Variables
Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2011)

2010

Identifying Emotional Expressions, Intensities and Sentence Level Emotion Tags Using a Supervised Framework
Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Finding Emotion Holder from Bengali Blog Texts—An Unsupervised Syntactic Approach
Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Labeling Emotion in Bengali Blog Corpus – A Fine Grained Tagging at Sentence Level
Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the Eighth Workshop on Asian Language Resources

Automatic Extraction of Complex Predicates in Bengali
Dipankar Das | Santanu Pal | Tapabrata Mondal | Tanmoy Chakraborty | Sivaji Bandyopadhyay
Proceedings of the 2010 Workshop on Multiword Expressions: from Theory to Applications

JU: A Supervised Approach to Identify Semantic Relations from Paired Nominals
Santanu Pal | Partha Pakray | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 5th International Workshop on Semantic Evaluation

Discerning Emotions of Bloggers based on Topics – a Supervised Coreference Approach in Bengali
Dipankar Das | Sivaji Bandyopadhyay
ROCLING 2010 Poster Papers

2009

Bengali Verb Subcategorization Frame Acquisition - A Baseline Model
Somnath Banerjee | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)

Word to Sentence Level Emotion Tagging for Bengali Blogs
Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers