Anupam Basu


2016

Effect of Syntactic Features in Bangla Sentence Comprehension
Manjira Sinha | Tirthankar Dasgupta | Anupam Basu
Proceedings of the 13th International Conference on Natural Language Processing

2015

Mining HEXACO personality traits from Enterprise Social Media
Priyanka Sinha | Lipika Dey | Pabitra Mitra | Anupam Basu
Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Compositionality in Bangla Compound Verbs and their Processing in the Mental Lexicon
Tirthankar Dasgupta | Manjira Sinha | Anupam Basu
Proceedings of the 12th International Conference on Natural Language Processing

2014

Text Readability in Hindi: A Comparative Study of Feature Performances Using Support Vectors
Manjira Sinha | Tirthankar Dasgupta | Anupam Basu
Proceedings of the 11th International Conference on Natural Language Processing

Influence of Target Reader Background and Text Features on Text Readability in Bangla: A Computational Approach
Manjira Sinha | Tirthankar Dasgupta | Anupam Basu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Design and Development of an Online Computational Framework to Facilitate Language Comprehension Research on Indian Languages
Manjira Sinha | Tirthankar Dasgupta | Anupam Basu
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we have developed an open-source online computational framework that different research groups can use to conduct reading research on Indian language texts. The framework can be used to build a large annotated collection of Indian language text comprehension data from different user-based experiments. Its novelty lies in bringing different empirical data-collection techniques for text comprehension under one roof. The framework has been customized specifically to address the particularities of Indian languages. It will also offer many types of automatic analysis of the data at different levels, such as the full-text, sentence, and word levels. To address the subjectivity of text difficulty perception, the framework captures user background against multiple factors. The assimilated data can be automatically cross-referenced against varying strata of readers.

2012

Forward Transliteration of Dzongkha Text to Braille
Tirthankar Dasgupta | Manjira Sinha | Anupam Basu
Proceedings of the Second Workshop on Advances in Text Input Methods

Automatic Extraction of Compound Verbs from Bangla Corpora
Sibanshu Mukhopadhayay | Tirthankar Dasgupta | Manjira Sinha | Anupam Basu
Proceedings of the 3rd Workshop on South and Southeast Asian Natural Language Processing

A New Semantic Lexicon and Similarity Measure in Bangla
Manjira Sinha | Abhik Jana | Tirthankar Dasgupta | Anupam Basu
Proceedings of the 3rd Workshop on Cognitive Aspects of the Lexicon

A Hybrid Dependency Parser for Bangla
Arnab Dhar | Sanjay Chatterji | Sudeshna Sarkar | Anupam Basu
Proceedings of the 10th Workshop on Asian Language Resources

Repairing Bengali Verb Chunks for Improved Bengali to Hindi Machine Translation
Sanjay Chatterji | Nabanita Datta | Arnab Dhar | Biswanath Barik | Sudeshna Sarkar | Anupam Basu
Proceedings of the 10th Workshop on Asian Language Resources

Translations of Ambiguous Hindi Pronouns to Possible Bengali Pronouns
Sanjay Chatterji | Sudeshna Sarkar | Anupam Basu
Proceedings of the 10th Workshop on Asian Language Resources

A Three Stage Hybrid Parser for Hindi
Sanjay Chatterji | Arnab Dhar | Sudeshna Sarkar | Anupam Basu
Proceedings of the Workshop on Machine Translation and Parsing in Indian Languages

Modelling the Organization and Processing of Bangla Polymorphemic Words in the Mental Lexicon: A Computational Approach
Tirthankar Dasgupta | Manjira Sinha | Anupam Basu
Proceedings of COLING 2012: Posters

New Readability Measures for Bangla and Hindi Texts
Manjira Sinha | Sakshi Sharma | Tirthankar Dasgupta | Anupam Basu
Proceedings of COLING 2012: Posters

2010

Determining Reliability of Subjective and Multi-label Emotion Annotation through Novel Fuzzy Agreement Measure
Plaban Kr. Bhowmick | Anupam Basu | Pabitra Mitra
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The paper presents a new fuzzy agreement measure $\gamma_f$ for determining agreement in multi-label and subjective annotation tasks. In this annotation framework, a data item may belong to a category or class with a belief value denoting the annotator's degree of confidence in assigning the item to that category. We provide a notion of disagreement based on the belief values that the annotators assign with respect to a category. The fuzzy agreement measure $\gamma_f$ is defined through different fuzzy agreement sets based on the distribution of differences between the belief values provided by the annotators. The fuzzy agreement is computed from the average agreement over all the data items and annotators. Finally, we elaborate on the computation of the $\gamma_f$ measure with a case study on emotion text data, where a data item (sentence) may belong to more than one emotion category with varying belief values.

Resource Creation for Training and Testing of Transliteration Systems for Indian Languages
Sowmya V. B. | Monojit Choudhury | Kalika Bali | Tirthankar Dasgupta | Anupam Basu
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Machine transliteration is used in a number of NLP applications, ranging from machine translation and information retrieval to input mechanisms for non-Roman scripts. Many popular Input Method Editors for Indian languages, such as Baraha, Akshara, and Quillpad, use back-transliteration as a mechanism to allow users to input text in a number of Indian languages. The lack of a standard dataset to evaluate these systems makes it difficult to make any meaningful comparisons of their relative accuracies. In this paper, we describe the methodology for the creation of a dataset of ~2500 transliterated sentence pairs each in Bangla, Hindi and Telugu. The data was collected across three different modes from a total of 60 users. We believe that this dataset will prove useful not only for the evaluation and training of back-transliteration systems but also for the linguistic analysis of the process of transliterating Indian languages from native scripts to Roman.

2009

Language Diversity across the Consonant Inventories: A Study in the Framework of Complex Networks
Monojit Choudhury | Animesh Mukherjee | Anupam Basu | Niloy Ganguly | Ashish Garg | Vaibhav Jalan
Proceedings of the EACL 2009 Workshop on Cognitive Aspects of Computational Language Acquisition

2008

Prototype Machine Translation System From Text-To-Indian Sign Language
Tirthankar Dasgupta | Sandipan Dandapat | Anupam Basu
Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages

A Multilingual Multimedia Indian Sign Language Dictionary Tool
Tirthankar Dasgupta | Sambit Shukla | Sandeep Kumar | Synny Diwakar | Anupam Basu
Proceedings of the 6th Workshop on Asian Language Resources

An Agreement Measure for Determining Inter-Annotator Reliability of Human Judgements on Affective Text
Plaban Kumar Bhowmick | Anupam Basu | Pabitra Mitra
Coling 2008: Proceedings of the workshop on Human Judgements in Computational Linguistics

Modeling the Structure and Dynamics of the Consonant Inventories: A Complex Network Approach
Animesh Mukherjee | Monojit Choudhury | Anupam Basu | Niloy Ganguly
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

How Difficult is it to Develop a Perfect Spell-checker? A Cross-Linguistic Analysis through Complex Network Approach
Monojit Choudhury | Markose Thomas | Animesh Mukherjee | Anupam Basu | Niloy Ganguly
Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing

Evolution, Optimization, and Language Change: The Case of Bengali Verb Inflections
Monojit Choudhury | Vaibhav Jalan | Sudeshna Sarkar | Anupam Basu
Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology

Emergence of Community Structures in Vowel Inventories: An Analysis Based on Complex Networks
Animesh Mukherjee | Monojit Choudhury | Anupam Basu | Niloy Ganguly
Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology

Redundancy Ratio: An Invariant Property of the Consonant Inventories of the World’s Languages
Animesh Mukherjee | Monojit Choudhury | Anupam Basu | Niloy Ganguly
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics

Automatic Part-of-Speech Tagging for Bengali: An Approach for Morphologically Rich Languages in a Poor Resource Scenario
Sandipan Dandapat | Sudeshna Sarkar | Anupam Basu
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions

2006

Analysis and Synthesis of the Distribution of Consonants over Languages: A Complex Network Approach
Monojit Choudhury | Animesh Mukherjee | Anupam Basu | Niloy Ganguly
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2004

A Diachronic Approach for Schwa Deletion in Indo Aryan Languages
Monojit Choudhury | Anupam Basu | Sudeshna Sarkar
Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology