Gayatri Bhat


2019

A Margin-based Loss with Synthetic Negative Samples for Continuous-output Machine Translation
Gayatri Bhat | Sachin Kumar | Yulia Tsvetkov
Proceedings of the 3rd Workshop on Neural Generation and Translation

Neural models that eliminate the softmax bottleneck by generating word embeddings (rather than multinomial distributions over a vocabulary) attain faster training with fewer learnable parameters. These models are currently trained by maximizing densities of pretrained target embeddings under von Mises-Fisher distributions parameterized by corresponding model-predicted embeddings. This work explores the utility of margin-based loss functions in optimizing such models. We present syn-margin loss, a novel margin-based loss that uses a synthetic negative sample constructed from only the predicted and target embeddings at every step. The loss is efficient to compute, and we use a geometric analysis to argue that it is more consistent and interpretable than other margin-based losses. Empirically, we find that syn-margin provides small but significant improvements over both vMF and standard margin-based losses in continuous-output neural machine translation.

CMU-01 at the SIGMORPHON 2019 Shared Task on Crosslinguality and Context in Morphology
Aditi Chaudhary | Elizabeth Salesky | Gayatri Bhat | David R. Mortensen | Jaime Carbonell | Yulia Tsvetkov
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper presents the submission by the CMU-01 team to Task 2 of SIGMORPHON 2019, Morphological Analysis and Lemmatization in Context. This task requires us to produce the lemma and morpho-syntactic description of each token in a sequence, for 107 treebanks. We approach this task with a hierarchical neural conditional random field (CRF) model which predicts each coarse-grained feature (e.g., POS, Case) independently. However, most treebanks are under-resourced, which makes it challenging to train deep neural models for them. Hence, we propose a multi-lingual transfer training regime where we transfer from multiple related languages that share similar typology.

2018

Language Modeling for Code-Mixing: The Role of Linguistic Theory based Synthetic Data
Adithya Pratapa | Gayatri Bhat | Monojit Choudhury | Sunayana Sitaram | Sandipan Dandapat | Kalika Bali
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Training language models for code-mixed (CM) language is known to be a difficult problem because of a lack of data, compounded by the increased confusability due to the presence of more than one language. We present a computational technique for creating grammatically valid artificial CM data based on the Equivalence Constraint Theory. We show that when training examples are sampled appropriately from this synthetic data and presented in a certain order (i.e., a training curriculum) along with monolingual and real CM data, they can significantly reduce the perplexity of an RNN-based language model. We also show that randomly generated CM data does not help in decreasing the perplexity of the LMs.