Mohit Yadav


pdf bib
Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Auto-Encoders
Andrew Drozdov | Patrick Verga | Mohit Yadav | Mohit Iyyer | Andrew McCallum
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We introduce the deep inside-outside recursive autoencoder (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree. Our approach predicts each word in an input sentence conditioned on the rest of the sentence. During training we use dynamic programming to consider all possible binary trees over the sentence, and for inference we use the CKY algorithm to extract the highest scoring parse. DIORA outperforms previously reported results for unsupervised binary constituency parsing on the benchmark WSJ dataset.


pdf bib
Learning and Knowledge Transfer with Memory Networks for Machine Comprehension
Mohit Yadav | Lovekesh Vig | Gautam Shroff
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Enabling machines to read and comprehend unstructured text remains an unfulfilled goal for NLP research. Recent research efforts on the “machine comprehension” task have managed to achieve close to ideal performance on simulated data. However, achieving similar levels of performance on small real world datasets has proved difficult; major challenges stem from the large vocabulary size, complex grammar, and, the frequent ambiguities in linguistic structure. On the other hand, the requirement of human generated annotations for training, in order to ensure a sufficiently diverse set of questions is prohibitively expensive. Motivated by these practical issues, we propose a novel curriculum inspired training procedure for Memory Networks to improve the performance for machine comprehension with relatively small volumes of training data. Additionally, we explore various training regimes for Memory Networks to allow knowledge transfer from a closely related domain having larger volumes of labelled data. We also suggest the use of a loss function to incorporate the asymmetric nature of knowledge transfer. Our experiments demonstrate improvements on Dailymail, CNN, and MCTest datasets.

pdf bib
Information Bottleneck Inspired Method For Chat Text Segmentation
S Vishal | Mohit Yadav | Lovekesh Vig | Gautam Shroff
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

We present a novel technique for segmenting chat conversations using the information bottleneck method (Tishby et al., 2000), augmented with sequential continuity constraints. Furthermore, we utilize critical non-textual clues such as time between two consecutive posts and people mentions within the posts. To ascertain the effectiveness of the proposed method, we have collected data from public Slack conversations and Fresco, a proprietary platform deployed inside our organization. Experiments demonstrate that the proposed method yields an absolute (relative) improvement of as high as 3.23% (11.25%). To facilitate future research, we are releasing manual annotations for segmentation on public Slack conversations.