Nikhil Gupta


2020

pdf bib
Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset
Edwin Zhang | Nikhil Gupta | Raphael Tang | Xiao Han | Ronak Pradeep | Kuang Lu | Yue Zhang | Rodrigo Nogueira | Kyunghyun Cho | Hui Fang | Jimmy Lin
Proceedings of the First Workshop on Scholarly Document Processing

We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI. Our system has been online and serving users since late March 2020. The Covidex is the user application component of our three-pronged strategy to develop technologies for helping domain experts tackle the ongoing global pandemic. In addition, we provide robust and easy-to-use keyword search infrastructure that exploits mature fusion-based methods as well as standalone neural ranking models that can be incorporated into other applications. These techniques have been evaluated in the multi-round TREC-COVID challenge: Our infrastructure and baselines have been adopted by many participants, including some of the best systems. In round 3, we submitted the highest-scoring run that took advantage of previous training data and the second-highest fully automatic run. In rounds 4 and 5, we submitted the highest-scoring fully automatic runs.

pdf bib
Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset
Edwin Zhang | Nikhil Gupta | Rodrigo Nogueira | Kyunghyun Cho | Jimmy Lin
Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020

The Neural Covidex is a search engine that exploits the latest neural ranking architectures to provide information access to the COVID-19 Open Research Dataset (CORD-19) curated by the Allen Institute for AI. It exists as part of a suite of tools we have developed to help domain experts tackle the ongoing global pandemic. We hope that improved information access capabilities to the scientific literature can inform evidence-based decision making and insight generation.

2019

pdf bib
Disentangling Language and Knowledge in Task-Oriented Dialogs
Dinesh Raghu | Nikhil Gupta | Mausam
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

The Knowledge Base (KB) used for real-world applications, such as booking a movie or restaurant reservation, keeps changing over time. End-to-end neural networks trained for these task-oriented dialogs are expected to be immune to any changes in the KB. However, existing approaches breakdown when asked to handle such changes. We propose an encoder-decoder architecture (BoSsNet) with a novel Bag-of-Sequences (BoSs) memory, which facilitates the disentangled learning of the response’s language model and its knowledge incorporation. Consequently, the KB can be modified with new knowledge without a drop in interpretability. We find that BoSsNeT outperforms state-of-the-art models, with considerable improvements (>10%) on bAbI OOV test sets and other human-human datasets. We also systematically modify existing datasets to measure disentanglement and show BoSsNeT to be robust to KB modifications.