Jimmy Lin


2020

DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
Ji Xin | Raphael Tang | Jaejun Lee | Yaoliang Yu | Jimmy Lin
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Large-scale pre-trained language models such as BERT have brought significant improvements to NLP applications. However, they are also notorious for being slow in inference, which makes them difficult to deploy in real-time applications. We propose a simple but effective method, DeeBERT, to accelerate BERT inference. Our approach allows samples to exit earlier without passing through the entire model. Experiments show that DeeBERT is able to save up to ~40% inference time with minimal degradation in model quality. Further analyses show different behaviors in the BERT transformer layers and also reveal their redundancy. Our work provides new ideas to efficiently apply deep transformer-based models to downstream tasks. Code is available at https://github.com/castorini/DeeBERT.
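
To make the early-exit mechanism concrete, here is a minimal PyTorch-style sketch of entropy-based exiting, assuming a list of transformer layers with a classifier ("off-ramp") attached after each one; the module names, the [CLS]-pooling step, and the entropy threshold are illustrative assumptions, not the DeeBERT implementation linked above.

```python
# Illustrative sketch of entropy-based early exiting (not the actual DeeBERT code).
# Assumes `layers` is a list of transformer encoder layers and `classifiers` is a
# list of per-layer classification heads ("off-ramps"), one after each layer.
import torch
import torch.nn.functional as F

def entropy(logits):
    """Shannon entropy of the predictive distribution."""
    probs = F.softmax(logits, dim=-1)
    return -(probs * probs.log()).sum(dim=-1)

def early_exit_forward(hidden, layers, classifiers, threshold=0.1):
    """Run layers sequentially; exit as soon as an off-ramp is confident enough.
    Assumes a single example at inference time (batch size 1)."""
    for i, (layer, clf) in enumerate(zip(layers, classifiers)):
        hidden = layer(hidden)
        logits = clf(hidden[:, 0])          # classify from the [CLS] position
        if entropy(logits).item() < threshold:
            return logits, i                # confident enough: exit at layer i
    return logits, len(layers) - 1          # otherwise use the final layer's output
```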

Showing Your Work Doesn’t Always Work
Raphael Tang | Jaejun Lee | Ji Xin | Xinyu Liu | Yaoliang Yu | Jimmy Lin
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In natural language processing, a recently popular line of work explores how to best report the experimental results of neural networks. One exemplar publication, titled “Show Your Work: Improved Reporting of Experimental Results” (Dodge et al., 2019), advocates for reporting the expected validation effectiveness of the best-tuned model, with respect to the computational budget. In the present work, we critically examine this paper. As far as statistical generalizability is concerned, we find unspoken pitfalls and caveats with this approach. We analytically show that their estimator is biased and uses error-prone assumptions. We find that the estimator favors negative errors and yields poor bootstrapped confidence intervals. We derive an unbiased alternative and bolster our claims with empirical evidence from statistical simulation. Our codebase is at https://github.com/castorini/meanmax.
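
As a concrete illustration of the quantity at issue, expected maximum validation performance as a function of the trial budget, here is a small numpy sketch that computes the expected maximum of b trials drawn without replacement from n observed scores. It is meant only to make the estimand concrete; the biased and unbiased estimators analyzed in the paper differ in their details, and the authors' code is in the linked repository.

```python
# Sketch: expected maximum validation score as a function of the trial budget b,
# computed exactly by averaging the max over all size-b subsets of the n observed
# trial scores (an illustration, not the paper's code).
from math import comb
import numpy as np

def expected_max(scores, b):
    """E[max of b trials], enumerated over all C(n, b) subsets without replacement."""
    v = np.sort(np.asarray(scores, dtype=float))   # v[0] <= ... <= v[n-1]
    n = len(v)
    # P(the maximum of a random size-b subset is v[i]) = C(i, b-1) / C(n, b):
    # the remaining b-1 trials must all come from the i scores below v[i].
    weights = np.array([comb(i, b - 1) for i in range(n)]) / comb(n, b)
    return float((weights * v).sum())

scores = [0.71, 0.74, 0.69, 0.80, 0.77]
print([round(expected_max(scores, b), 3) for b in range(1, 6)])
```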

Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data
Hamidreza Shahidi | Ming Li | Jimmy Lin
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

A number of researchers have recently questioned the necessity of increasingly complex neural network (NN) architectures. In particular, several recent papers have shown that simpler, properly tuned models are at least competitive across several NLP tasks. In this work, we show that this is also the case for text generation from structured and unstructured data. We consider neural table-to-text generation and neural question generation (NQG) tasks for text generation from structured and unstructured data, respectively. Table-to-text generation aims to generate a description based on a given table, and NQG is the task of generating, from a given passage, a question that can be answered by a certain sub-span of that passage. Experimental results demonstrate that a basic attention-based seq2seq model trained with the exponential moving average technique achieves the state of the art in both tasks. Code is available at https://github.com/h-shahidi/2birds-gen.

Document Ranking with a Pretrained Sequence-to-Sequence Model
Rodrigo Nogueira | Zhiying Jiang | Ronak Pradeep | Jimmy Lin
Findings of the Association for Computational Linguistics: EMNLP 2020

This work proposes the use of a pretrained sequence-to-sequence model for document ranking. Our approach is fundamentally different from a commonly adopted classification-based formulation based on encoder-only pretrained transformer architectures such as BERT. We show how a sequence-to-sequence model can be trained to generate relevance labels as “target tokens”, and how the underlying logits of these target tokens can be interpreted as relevance probabilities for ranking. Experimental results on the MS MARCO passage ranking task show that our ranking approach is superior to strong encoder-only models. On three other document retrieval test collections, we demonstrate a zero-shot transfer-based approach that outperforms previous state-of-the-art models requiring in-domain cross-validation. Furthermore, we find that our approach significantly outperforms an encoder-only architecture in a data-poor setting. We investigate this observation in more detail by varying target tokens to probe the model’s use of latent knowledge. Surprisingly, we find that the choice of target tokens impacts effectiveness, even for words that are closely related semantically. This finding sheds some light on why our sequence-to-sequence formulation for document ranking is effective. Code and models are available at pygaggle.ai.
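
A rough sketch of the scoring recipe described above, using the Hugging Face transformers API: the model reads a query-document prompt, and the softmax over the logits of the "true" and "false" target tokens at the first decoding step is treated as a relevance probability. The prompt wording, model checkpoint, and the single-token assumption for "true"/"false" are illustrative; the authors' trained models and code are at pygaggle.ai.

```python
# Sketch of scoring a query-document pair with a sequence-to-sequence model by
# reading off the logits of "true" / "false" target tokens (illustrative only).
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval()

def relevance_score(query, document):
    text = f"Query: {query} Document: {document} Relevant:"
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # Decode a single step starting from the decoder start token.
    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits[0, 0]
    # Assumes "true" and "false" are single SentencePiece tokens.
    true_id = tokenizer("true", add_special_tokens=False).input_ids[0]
    false_id = tokenizer("false", add_special_tokens=False).input_ids[0]
    probs = torch.softmax(logits[[true_id, false_id]], dim=0)
    return probs[0].item()   # probability mass on "true" = relevance score
```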

Cross-Lingual Training of Neural Models for Document Ranking
Peng Shi | He Bai | Jimmy Lin
Findings of the Association for Computational Linguistics: EMNLP 2020

We tackle the challenge of cross-lingual training of neural document ranking models for monolingual retrieval, specifically leveraging relevance judgments in English to improve search in non-English languages. Our work successfully applies multilingual BERT (mBERT) to document ranking and additionally compares against a number of alternatives: translating the training data, translating documents, multi-stage hybrids, and ensembles. Experiments on test collections in six different languages from diverse language families reveal many interesting findings: model-based relevance transfer using mBERT can significantly improve search quality in (non-English) monolingual retrieval, but other “low resource” approaches are competitive as well.

Inserting Information Bottlenecks for Attribution in Transformers
Zhiying Jiang | Raphael Tang | Ji Xin | Jimmy Lin
Findings of the Association for Computational Linguistics: EMNLP 2020

Pretrained transformers achieve the state of the art across tasks in natural language processing, motivating researchers to investigate their inner mechanisms. One common direction is to understand what features are important for prediction. In this paper, we apply information bottlenecks to analyze the attribution of each feature for prediction on a black-box model. We use BERT as the example and evaluate our approach both quantitatively and qualitatively. We show the effectiveness of our method in terms of attribution and the ability to provide insight into how information flows through layers. We demonstrate that our technique outperforms two competitive methods in degradation tests on four datasets. Code is available at https://github.com/bazingagin/IBA.
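
A schematic PyTorch sketch of the general idea of inserting an information bottleneck on an intermediate representation: a learned mask interpolates between the hidden state and noise, and the training objective trades task loss against the amount of information passed through. The mask parameterization and the information penalty below are simplified illustrations, not the method in the linked repository.

```python
# Sketch of an information bottleneck placed on an intermediate hidden state:
# a learned mask lambda in [0, 1] interpolates between the signal and noise,
# and a penalty term discourages letting information through.
# Illustrative only; see https://github.com/bazingagin/IBA for the actual method.
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(hidden_size))  # per-dimension mask logits

    def forward(self, hidden):
        lam = torch.sigmoid(self.alpha)                  # how much signal each dimension keeps
        noise = torch.randn_like(hidden) * hidden.std() + hidden.mean()
        z = lam * hidden + (1.0 - lam) * noise           # noisy, compressed representation
        info_loss = lam.mean()                           # simplified information penalty
        return z, info_loss

# Training sketch: minimize task_loss(model_tail(z)) + beta * info_loss, then read
# attributions from how much information each feature is allowed to pass through.
```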

Early Exiting BERT for Efficient Document Ranking
Ji Xin | Rodrigo Nogueira | Yaoliang Yu | Jimmy Lin
Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing

Pre-trained language models such as BERT have shown their effectiveness in various tasks. Despite their power, they are known to be computationally intensive, which hinders real-world applications. In this paper, we introduce early exiting BERT for document ranking. With a slight modification, BERT becomes a model with multiple output paths, and each inference sample can exit early from these paths. In this way, computation can be effectively allocated among samples, and overall system latency is significantly reduced while the original quality is maintained. Our experiments on two document ranking datasets demonstrate up to 2.5x inference speedup with minimal quality degradation. The source code of our implementation can be found at https://github.com/castorini/earlyexiting-monobert.

A Little Bit Is Worse Than None: Ranking with Limited Training Data
Xinyu Zhang | Andrew Yates | Jimmy Lin
Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing

Researchers have proposed simple yet effective techniques for the retrieval problem based on using BERT as a relevance classifier to rerank initial candidates from keyword search. In this work, we tackle the challenge of fine-tuning these models for specific domains in a data and computationally efficient manner. Typically, researchers fine-tune models using corpus-specific labeled data from sources such as TREC. We first answer the question: How much data of this type do we need? Recognizing that the most computationally efficient training is no training, we explore zero-shot ranking using BERT models that have already been fine-tuned with the large MS MARCO passage retrieval dataset. We arrive at the surprising and novel finding that “some” labeled in-domain data can be worse than none at all.

Designing Templates for Eliciting Commonsense Knowledge from Pretrained Sequence-to-Sequence Models
Jheng-Hong Yang | Sheng-Chieh Lin | Rodrigo Nogueira | Ming-Feng Tsai | Chuan-Ju Wang | Jimmy Lin
Proceedings of the 28th International Conference on Computational Linguistics

While internalized “implicit knowledge” in pretrained transformers has led to fruitful progress in many natural language understanding tasks, how to most effectively elicit such knowledge remains an open question. Based on the text-to-text transfer transformer (T5) model, this work explores a template-based approach to extract implicit knowledge for commonsense reasoning on multiple-choice (MC) question answering tasks. Experiments on three representative MC datasets show the surprisingly good performance of our simple template, coupled with a logit normalization technique for disambiguation. Furthermore, we verify that our proposed template can be easily extended to other MC tasks with contexts such as supporting facts in open-book question answering settings. Starting from the MC task, this work initiates further research to find generic natural language templates that can effectively leverage stored knowledge in pretrained models.
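
As a sketch of how template filling plus scoring can work with T5: fill a hypothetical template with the question, score each answer choice by its per-token log-likelihood under the model, and pick the highest-scoring choice. The template wording and the length normalization via mean token loss are illustrative assumptions, not the paper's exact template or normalization technique.

```python
# Sketch of template-based multiple-choice scoring with T5: fill a template with
# the question, score each candidate answer under the model, and compare the
# normalized scores across choices. Template and normalization are illustrative.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval()

def choice_logprob(question, choice):
    src = tokenizer(f"question: {question}", return_tensors="pt")
    tgt = tokenizer(choice, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(**src, labels=tgt)
    # out.loss is the mean per-token cross-entropy, so -loss is a
    # length-normalized log-probability of the answer tokens.
    return -out.loss.item()

def pick_answer(question, choices):
    scores = [choice_logprob(question, c) for c in choices]
    return max(range(len(choices)), key=scores.__getitem__)
```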

Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset
Edwin Zhang | Nikhil Gupta | Raphael Tang | Xiao Han | Ronak Pradeep | Kuang Lu | Yue Zhang | Rodrigo Nogueira | Kyunghyun Cho | Hui Fang | Jimmy Lin
Proceedings of the First Workshop on Scholarly Document Processing

We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI. Our system has been online and serving users since late March 2020. The Covidex is the user application component of our three-pronged strategy to develop technologies for helping domain experts tackle the ongoing global pandemic. In addition, we provide robust and easy-to-use keyword search infrastructure that exploits mature fusion-based methods as well as standalone neural ranking models that can be incorporated into other applications. These techniques have been evaluated in the multi-round TREC-COVID challenge: Our infrastructure and baselines have been adopted by many participants, including some of the best systems. In round 3, we submitted the highest-scoring run that took advantage of previous training data and the second-highest fully automatic run. In rounds 4 and 5, we submitted the highest-scoring fully automatic runs.

Cydex: Neural Search Infrastructure for the Scholarly Literature
Shane Ding | Edwin Zhang | Jimmy Lin
Proceedings of the First Workshop on Scholarly Document Processing

Cydex is a platform that provides neural search infrastructure for domain-specific scholarly literature. The platform represents an abstraction of Covidex, our recently developed full-stack open-source search engine for the COVID-19 Open Research Dataset (CORD-19) from AI2. While Covidex takes advantage of the latest best practices for keyword search using the popular Lucene search library as well as state-of-the-art neural ranking models using T5, parts of the system were hard coded to only work with CORD-19. This paper describes our efforts to generalize Covidex into Cydex, which can be applied to scholarly literature in different domains. By decoupling corpus-specific configurations from the frontend implementation, we are able to demonstrate the generality of Cydex on two very different corpora: the ACL Anthology and a collection of hydrology abstracts. Our platform is entirely open source and available at cydex.ai.

Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset
Edwin Zhang | Nikhil Gupta | Rodrigo Nogueira | Kyunghyun Cho | Jimmy Lin
Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020

The Neural Covidex is a search engine that exploits the latest neural ranking architectures to provide information access to the COVID-19 Open Research Dataset (CORD-19) curated by the Allen Institute for AI. It exists as part of a suite of tools we have developed to help domain experts tackle the ongoing global pandemic. We hope that improved information access capabilities to the scientific literature can inform evidence-based decision making and insight generation.

Howl: A Deployed, Open-Source Wake Word Detection System
Raphael Tang | Jaejun Lee | Afsaneh Razi | Julia Cambre | Ian Bicking | Jofish Kaye | Jimmy Lin
Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS)

We describe Howl, an open-source wake word detection toolkit with native support for open speech datasets such as Mozilla Common Voice (MCV) and Google Speech Commands (GSC). We report benchmark results of various models supported by our toolkit on GSC and our own freely available wake word detection dataset, built from MCV. One of our models is deployed in Firefox Voice, a plugin enabling speech interactivity for the Firefox web browser. Howl represents, to the best of our knowledge, the first fully productionized, open-source wake word detection toolkit with a web browser deployment target. Our codebase is at howl.ai.

Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT
Ashutosh Adhikari | Achyudh Ram | Raphael Tang | William L. Hamilton | Jimmy Lin
Proceedings of the 5th Workshop on Representation Learning for NLP

Fine-tuned variants of BERT are able to achieve state-of-the-art accuracy on many natural language processing tasks, although at significant computational costs. In this paper, we verify BERT’s effectiveness for document classification and investigate the extent to which BERT-level effectiveness can be obtained by different baselines, combined with knowledge distillation—a popular model compression method. The results show that BERT-level effectiveness can be achieved by a single-layer LSTM with at least 40× fewer FLOPS and only ~3% of the parameters. More importantly, this study analyzes the limits of knowledge distillation as we distill BERT’s knowledge all the way down to linear models—a relevant baseline for the task. We report substantial improvement in effectiveness for even the simplest models, as they capture the knowledge learnt by BERT.
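
For reference, the sketch below shows a standard knowledge-distillation objective of the kind used in this line of work: the student matches the teacher's temperature-softened output distribution while also fitting the gold labels. The temperature, mixing weight, and variable names are illustrative, not the paper's exact setup.

```python
# Sketch of a standard knowledge-distillation objective: the student matches the
# teacher's temperature-softened output distribution while also fitting the gold
# labels. Hyperparameters and names are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        soft_targets,
        reduction="batchmean",
    ) * (T * T)                                   # rescale so gradients stay comparable
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```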

2019

Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling
Linqing Liu | Wei Yang | Jinfeng Rao | Raphael Tang | Jimmy Lin
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Semantic similarity modeling is central to many NLP problems such as natural language inference and question answering. Syntactic structures interact closely with semantics in learning compositional representations and alleviating long-range dependency issues. However, such structure priors have not been well exploited in previous work for semantic modeling. To examine their effectiveness, we start with the Pairwise Word Interaction Model, one of the best models according to a recent reproducibility study, then introduce components for modeling context and structure using multi-layer BiLSTMs and TreeLSTMs. In addition, we introduce residual connections to the deep convolutional neural network component of the model. Extensive evaluations on eight benchmark datasets show that incorporating structural information contributes to consistent improvements over strong baselines.

Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval
Zeynep Akkalyoncu Yilmaz | Wei Yang | Haotian Zhang | Jimmy Lin
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper applies BERT to ad hoc document retrieval on news articles, which requires addressing two challenges: relevance judgments in existing test collections are typically provided only at the document level, and documents often exceed the length that BERT was designed to handle. Our solution is to aggregate sentence-level evidence to rank documents. Furthermore, we are able to leverage passage-level relevance judgments fortuitously available in other domains to fine-tune BERT models that are able to capture cross-domain notions of relevance, and can be directly used for ranking news articles. Our simple neural ranking models achieve state-of-the-art effectiveness on three standard test collections.
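
A small sketch of the aggregation idea: interpolate the first-stage retrieval score with a weighted sum of the top-k sentence-level BERT scores. The number of sentences, the interpolation weight, and the per-sentence weights are tuned hyperparameters in practice; the values below are placeholders.

```python
# Sketch of aggregating sentence-level BERT scores into a document score:
# interpolate the first-stage retrieval score with a weighted sum of the top-k
# sentence scores. k, alpha, and the weights are illustrative, not tuned values.
def document_score(retrieval_score, sentence_scores, k=3, alpha=0.5,
                   weights=(1.0, 0.5, 0.25)):
    top = sorted(sentence_scores, reverse=True)[:k]
    evidence = sum(w * s for w, s in zip(weights, top))
    return alpha * retrieval_score + (1.0 - alpha) * evidence
```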

Aligning Cross-Lingual Entities with Multi-Aspect Information
Hsiu-Wei Yang | Yanyan Zou | Peng Shi | Wei Lu | Jimmy Lin | Xu Sun
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Multilingual knowledge graphs (KGs), such as YAGO and DBpedia, represent entities in different languages. The task of cross-lingual entity alignment is to match entities in a source language with their counterparts in target languages. In this work, we investigate embedding-based approaches to encode entities from multilingual KGs into the same vector space, where equivalent entities are close to each other. Specifically, we apply graph convolutional networks (GCNs) to combine multi-aspect information of entities, including topological connections, relations, and attributes of entities, to learn entity embeddings. To exploit the literal descriptions of entities expressed in different languages, we propose two uses of a pretrained multilingual BERT model to bridge cross-lingual gaps. We further propose two strategies to integrate GCN-based and BERT-based modules to boost performance. Extensive experiments on two benchmark datasets demonstrate that our method significantly outperforms existing systems.

Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling
Jinfeng Rao | Linqing Liu | Yi Tay | Wei Yang | Peng Shi | Jimmy Lin
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

A core problem of information retrieval (IR) is relevance matching, which is to rank documents by relevance to a user’s query. On the other hand, many NLP problems, such as question answering and paraphrase identification, can be considered variants of semantic matching, which is to measure the semantic distance between two short texts. While at a high level both relevance and semantic matching require modeling textual similarity, many existing techniques for one cannot be easily adapted to the other. To bridge this gap, we propose a novel model, HCAN (Hybrid Co-Attention Network), that comprises (1) a hybrid encoder module that includes ConvNet-based and LSTM-based encoders, (2) a relevance matching module that measures soft term matches with importance weighting at multiple granularities, and (3) a semantic matching module with co-attention mechanisms that capture context-aware semantic relatedness. Evaluations on multiple IR and NLP benchmarks demonstrate state-of-the-art effectiveness compared to approaches that do not exploit pretraining on external data. Extensive ablation studies suggest that relevance and semantic matching signals are complementary across many problem settings, regardless of the choice of underlying encoders.

What Part of the Neural Network Does This? Understanding LSTMs by Measuring and Dissecting Neurons
Ji Xin | Jimmy Lin | Yaoliang Yu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Memory neurons of long short-term memory (LSTM) networks encode and process information in powerful yet mysterious ways. While there has been work to analyze their behavior in carrying low-level information such as linguistic properties, how they directly contribute to label prediction remains unclear. We find inspiration from biologists and study the affinity between individual neurons and labels, propose a novel metric to quantify the sensitivity of neurons to each label, and conduct experiments to show the validity of our proposed metric. We discover that some neurons are trained to specialize on a subset of labels, and while dropping an arbitrary neuron has little effect on the overall accuracy of the model, dropping label-specialized neurons predictably and significantly degrades prediction accuracy on the associated label. We further examine the consistency of neuron-label affinity across different models. These observations provide insight into the inner mechanisms of LSTMs.
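
One simple way to make the notion of neuron-label affinity concrete is to compare a neuron's mean activation on examples of a given label against its mean activation on all other examples, as in the numpy sketch below. This illustrates the kind of per-neuron, per-label statistic involved; it is not the specific metric proposed in the paper.

```python
# Sketch: a simple neuron-label affinity statistic -- how much higher a neuron's
# mean activation is on examples of a given label than on all other examples.
# Illustrative of the kind of measurement involved, not the paper's exact metric.
import numpy as np

def neuron_label_affinity(activations, labels):
    """activations: (num_examples, num_neurons); labels: (num_examples,).
    Returns an array of shape (num_labels, num_neurons)."""
    activations = np.asarray(activations, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    affinity = np.zeros((len(classes), activations.shape[1]))
    for i, c in enumerate(classes):
        in_class = activations[labels == c].mean(axis=0)
        out_class = activations[labels != c].mean(axis=0)
        affinity[i] = in_class - out_class
    return affinity
```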

Applying BERT to Document Retrieval with Birch
Zeynep Akkalyoncu Yilmaz | Shengjin Wang | Wei Yang | Haotian Zhang | Jimmy Lin
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

We present Birch, a system that applies BERT to document retrieval via integration with the open-source Anserini information retrieval toolkit to demonstrate end-to-end search over large document collections. Birch implements simple ranking models that achieve state-of-the-art effectiveness on standard TREC newswire and social media test collections. This demonstration focuses on technical challenges in the integration of NLP and IR capabilities, along with the design rationale behind our approach to tightly-coupled integration between Python (to support neural networks) and the Java Virtual Machine (to support document retrieval using the open-source Lucene search library). We demonstrate integration of Birch with an existing search interface as well as interactive notebooks that highlight its capabilities in an easy-to-understand manner.

Honkling: In-Browser Personalization for Ubiquitous Keyword Spotting
Jaejun Lee | Raphael Tang | Jimmy Lin
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

Used for simple commands recognition on devices from smart speakers to mobile phones, keyword spotting systems are everywhere. Ubiquitous as well are web applications, which have grown in popularity and complexity over the last decade. However, despite their obvious advantages in natural language interaction, voice-enabled web applications are still few and far between. We attempt to bridge this gap with Honkling, a novel, JavaScript-based keyword spotting system. Purely client-side and cross-device compatible, Honkling can be deployed directly on user devices. Our in-browser implementation enables seamless personalization, which can greatly improve model quality; in the presence of underrepresented, non-American user accents, we can achieve up to an absolute 10% increase in accuracy in the personalized model with only a few examples.

Natural Language Generation for Effective Knowledge Distillation
Raphael Tang | Yao Lu | Jimmy Lin
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)

Knowledge distillation can effectively transfer knowledge from BERT, a deep language representation model, to traditional, shallow word embedding-based neural networks, helping them approach or exceed the quality of other heavyweight language representation models. As shown in previous work, critical to this distillation procedure is the construction of an unlabeled transfer dataset, which enables effective knowledge transfer. To create transfer set examples, we propose to sample from pretrained language models fine-tuned on task-specific text. Unlike previous techniques, this directly captures the purpose of the transfer set. We hypothesize that this principled, general approach outperforms rule-based techniques. On four datasets in sentiment classification, sentence similarity, and linguistic acceptability, we show that our approach improves upon previous methods. We outperform OpenAI GPT, a deep pretrained transformer, on three of the datasets, while using a single-layer bidirectional LSTM that runs at least ten times faster.
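
A sketch of the transfer-set construction step using the Hugging Face transformers API: sample free text from a language model (here a stock GPT-2 checkpoint as a stand-in for one fine-tuned on task-specific text), then soft-label the samples with the fine-tuned teacher. The model name and sampling settings are illustrative assumptions.

```python
# Sketch of building a transfer set for distillation by sampling from a language
# model (ideally one fine-tuned on task-specific text), then soft-labeling the
# samples with the teacher. Checkpoint and sampling settings are illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")      # stand-in for a fine-tuned LM
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sample_transfer_examples(n=100, max_length=40):
    examples = []
    for _ in range(n):
        input_ids = torch.tensor([[tokenizer.bos_token_id]])
        out = model.generate(
            input_ids,
            do_sample=True,
            top_k=40,
            max_length=max_length,
            pad_token_id=tokenizer.eos_token_id,
        )
        examples.append(tokenizer.decode(out[0], skip_special_tokens=True))
    return examples

# Each sampled sentence is then scored by the fine-tuned teacher, and the
# shallow student is trained to match those soft labels.
```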

Scalable Knowledge Graph Construction from Text Collections
Ryan Clancy | Ihab F. Ilyas | Jimmy Lin
Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)

We present a scalable, open-source platform that “distills” a potentially large text collection into a knowledge graph. Our platform takes documents stored in Apache Solr and scales out the Stanford CoreNLP toolkit via Apache Spark integration to extract mentions and relations that are then ingested into the Neo4j graph database. The raw knowledge graph is then enriched with facts extracted from an external knowledge graph. The complete product can be manipulated by various applications using Neo4j’s native Cypher query language: We present a subgraph-matching approach to align extracted relations with external facts and show that fact verification, locating textual support for asserted facts, detecting inconsistent and missing facts, and extracting distantly-supervised training data can all be performed within the same framework.

Simple Attention-Based Representation Learning for Ranking Short Social Media Posts
Peng Shi | Jinfeng Rao | Jimmy Lin
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

This paper explores the problem of ranking short social media posts with respect to user queries using neural networks. Instead of starting with a complex architecture, we proceed from the bottom up and examine the effectiveness of a simple, word-level Siamese architecture augmented with attention-based mechanisms for capturing semantic “soft” matches between query and post tokens. Extensive experiments on datasets from the TREC Microblog Tracks show that our simple models not only achieve better effectiveness than existing approaches that are far more complex or exploit a more diverse set of relevance signals, but are also much faster.

Rethinking Complex Neural Network Architectures for Document Classification
Ashutosh Adhikari | Achyudh Ram | Raphael Tang | Jimmy Lin
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Neural network models for many NLP tasks have grown increasingly complex in recent years, making training and deployment more difficult. A number of recent papers have questioned the necessity of such architectures and found that well-executed, simpler models are quite effective. We show that this is also the case for document classification: in a large-scale reproducibility study of several recent neural models, we find that a simple BiLSTM architecture with appropriate regularization yields accuracy and F1 that are either competitive or exceed the state of the art on four standard benchmark datasets. Surprisingly, our simple model is able to achieve these results without attention mechanisms. While these regularization techniques, borrowed from language modeling, are not novel, to our knowledge we are the first to apply them in this context. Our work provides an open-source platform and the foundation for future work in document classification.

Detecting Customer Complaint Escalation with Recurrent Neural Networks and Manually-Engineered Features
Wei Yang | Luchen Tan | Chunwei Lu | Anqi Cui | Han Li | Xi Chen | Kun Xiong | Muzi Wang | Ming Li | Jian Pei | Jimmy Lin
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers)

Consumers dissatisfied with the normal dispute resolution process provided by an e-commerce company’s customer service agents have the option of escalating their complaints by filing grievances with a government authority. This paper tackles the challenge of monitoring ongoing text chat dialogues to identify cases where the customer expresses such an intent, providing triage and prioritization for a separate pool of agents specially trained to handle more complex situations. We describe a hybrid model that tackles this challenge by integrating recurrent neural networks with manually-engineered features. Experiments show that both components are complementary and contribute to overall recall, outperforming competitive baselines. A trial online deployment of our model demonstrates its business value in improving customer service.

End-to-End Open-Domain Question Answering with BERTserini
Wei Yang | Yuqing Xie | Aileen Lin | Xingyu Li | Luchen Tan | Kun Xiong | Ming Li | Jimmy Lin
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)

We demonstrate an end-to-end question answering system that integrates BERT with the open-source Anserini information retrieval toolkit. In contrast to most question answering and reading comprehension models today, which operate over small amounts of input text, our system integrates best practices from IR with a BERT-based reader to identify answers from a large corpus of Wikipedia articles in an end-to-end fashion. We report large improvements over previous results on a standard benchmark test collection, showing that fine-tuning pretrained BERT with SQuAD is sufficient to achieve high accuracy in identifying answer spans.
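
The pipeline described above combines retrieval and reading evidence when ranking candidate answer spans; a minimal sketch of a linear interpolation of the two scores, with an assumed weight mu, is shown below.

```python
# Sketch of combining retriever and reader evidence in a retrieve-then-read
# pipeline: each candidate answer span is ranked by an interpolation of the
# paragraph retrieval score and the reader's span score. The weight mu is a
# tuned hyperparameter; the default below is illustrative.
def combined_score(retriever_score, reader_score, mu=0.5):
    return (1.0 - mu) * retriever_score + mu * reader_score

def best_answer(candidates, mu=0.5):
    """candidates: list of (answer_text, retriever_score, reader_score)."""
    return max(candidates, key=lambda c: combined_score(c[1], c[2], mu))
```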

2018

Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks
Salman Mohammed | Peng Shi | Jimmy Lin
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

We examine the problem of question answering over knowledge graphs, focusing on simple questions that can be answered by the lookup of a single fact. Adopting a straightforward decomposition of the problem into entity detection, entity linking, relation prediction, and evidence combination, we explore simple yet strong baselines. On the popular SimpleQuestions dataset, we find that basic LSTMs and GRUs plus a few heuristics yield accuracies that approach the state of the art, and techniques that do not use neural networks also perform reasonably well. These results show that gains from sophisticated deep learning techniques proposed in the literature are quite modest and that some previous models exhibit unnecessary complexity.

Pay-Per-Request Deployment of Neural Network Models Using Serverless Architectures
Zhucheng Tu | Mengping Li | Jimmy Lin
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

We demonstrate the serverless deployment of neural networks for model inferencing in NLP applications using Amazon’s Lambda service for feedforward evaluation and DynamoDB for storing word embeddings. Our architecture realizes a pay-per-request pricing model, requiring zero ongoing costs for maintaining server instances. All virtual machine management is handled behind the scenes by the cloud provider without any direct developer intervention. We describe a number of techniques that allow efficient use of serverless resources, and evaluations confirm that our design is both scalable and inexpensive.
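
A minimal sketch of the deployment pattern described above: an AWS Lambda handler that looks up word embeddings in DynamoDB and applies a small feedforward step per request. The table name, key schema, and scoring step are hypothetical placeholders, not the authors' actual configuration.

```python
# Sketch of a serverless inference handler: look up word embeddings in DynamoDB
# and run a small feedforward step per request. Table name, key schema, and the
# scoring step are hypothetical placeholders.
import json
import numpy as np
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("word-embeddings")        # hypothetical table: word -> vector

def lookup_embedding(word, dim=300):
    item = table.get_item(Key={"word": word}).get("Item")
    return np.array(item["vector"], dtype=float) if item else np.zeros(dim)

def handler(event, context):
    tokens = event["text"].lower().split()
    # Average word vectors, then apply a simple (placeholder) feedforward scoring step.
    features = np.mean([lookup_embedding(t) for t in tokens], axis=0)
    score = float(1.0 / (1.0 + np.exp(-features.sum())))
    return {"statusCode": 200, "body": json.dumps({"score": score})}
```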

CNNs for NLP in the Browser: Client-Side Deployment and Visualization Opportunities
Yiyun Liang | Zhucheng Tu | Laetitia Huang | Jimmy Lin
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

We demonstrate a JavaScript implementation of a convolutional neural network that performs feedforward inference completely in the browser. Such a deployment means that models can run completely on the client, on a wide range of devices, without making backend server requests. This design is useful for applications with stringent latency requirements or low connectivity. Our evaluations show the feasibility of JavaScript as a deployment target. Furthermore, an in-browser implementation enables seamless integration with the JavaScript ecosystem for information visualization, providing opportunities to visually inspect neural networks and better understand their inner workings.

Farewell Freebase: Migrating the SimpleQuestions Dataset to DBpedia
Michael Azmy | Peng Shi | Jimmy Lin | Ihab Ilyas
Proceedings of the 27th International Conference on Computational Linguistics

Question answering over knowledge graphs is an important problem of interest both commercially and academically. There is substantial interest in the class of natural language questions that can be answered via the lookup of a single fact, driven by the availability of the popular SimpleQuestions dataset. The problem with this dataset, however, is that answer triples are provided from Freebase, which has been defunct for several years. As a result, it is difficult to build “real-world” question answering systems that are operationally deployable. Furthermore, a defunct knowledge graph means that much of the infrastructure for querying, browsing, and manipulating triples no longer exists. To address this problem, we present SimpleDBpediaQA, a new benchmark dataset for simple question answering over knowledge graphs that was created by mapping SimpleQuestions entities and predicates from Freebase to DBpedia. Although this mapping is conceptually straightforward, there are a number of nuances that make the task non-trivial, owing to the different conceptual organizations of the two knowledge graphs. To lay the foundation for future research using this dataset, we leverage recent work to provide simple yet strong baselines with and without neural networks.

2017

An Insight Extraction System on BioMedical Literature with Deep Neural Networks
Hua He | Kris Ganjam | Navendu Jain | Jessica Lundin | Ryen White | Jimmy Lin
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Mining biomedical text offers an opportunity to automatically discover important facts and infer associations among them. As new scientific findings appear across a large collection of biomedical publications, our aim is to tap into this literature to automate biomedical knowledge extraction and identify important insights from them. Towards that goal, we develop a system with novel deep neural networks to extract insights from biomedical literature. Evaluation shows that our system is able to provide insights with competitive accuracy in terms of human acceptance, and its relation extraction component outperforms previous work.

2016

Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement
Hua He | Jimmy Lin
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

UMD-TTIC-UW at SemEval-2016 Task 1: Attention-Based Multi-Perspective Convolutional Neural Networks for Textual Similarity Measurement
Hua He | John Wieting | Kevin Gimpel | Jinfeng Rao | Jimmy Lin
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks
Hua He | Kevin Gimpel | Jimmy Lin
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Gappy Pattern Matching on GPUs for On-Demand Extraction of Hierarchical Translation Grammars
Hua He | Jimmy Lin | Adam Lopez
Transactions of the Association for Computational Linguistics, Volume 3

Grammars for machine translation can be materialized on demand by finding source phrases in an indexed parallel corpus and extracting their translations. This approach is limited in practical applications by the computational expense of online lookup and extraction. For phrase-based models, recent work has shown that on-demand grammar extraction can be greatly accelerated by parallelization on general purpose graphics processing units (GPUs), but these algorithms do not work for hierarchical models, which require matching patterns that contain gaps. We address this limitation by presenting a novel GPU algorithm for on-demand hierarchical grammar extraction that is at least an order of magnitude faster than a comparable CPU algorithm when processing large batches of sentences. In terms of end-to-end translation, with decoding on the CPU, we increase throughput by roughly two thirds on a standard MT evaluation dataset. The GPU necessary to achieve these improvements increases the cost of a server by about a third. We believe that GPU-based extraction of hierarchical grammars is an attractive proposition, particularly for MT applications that demand high throughput.

2013

Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation
Vladimir Eidelman | Ke Wu | Ferhan Ture | Philip Resnik | Jimmy Lin
Proceedings of the Eighth Workshop on Statistical Machine Translation

Mr. MIRA: Open-Source Large-Margin Structured Learning on MapReduce
Vladimir Eidelman | Ke Wu | Ferhan Ture | Philip Resnik | Jimmy Lin
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Massively Parallel Suffix Array Queries and On-Demand Phrase Extraction for Statistical Machine Translation Using GPUs
Hua He | Jimmy Lin | Adam Lopez
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

NAACL HLT 2013 Tutorial Abstracts
Jimmy Lin | Katrin Erk
NAACL HLT 2013 Tutorial Abstracts

2012

Why Not Grab a Free Lunch? Mining Large Corpora for Parallel Sentences to Improve Translation Modeling
Ferhan Ture | Jimmy Lin
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Combining Statistical Translation Techniques for Cross-Language Information Retrieval
Ferhan Ture | Jimmy Lin | Douglas Oard
Proceedings of COLING 2012

2010

Putting the User in the Loop: Interactive Maximal Marginal Relevance for Query-Focused Summarization
Jimmy Lin | Nitin Madnani | Bonnie Dorr
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Data-Intensive Text Processing with MapReduce
Jimmy Lin | Chris Dyer
NAACL HLT 2010 Tutorial Abstracts

2009

Data Intensive Text Processing with MapReduce
Jimmy Lin | Chris Dyer
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts

2008

Pairwise Document Similarity in Large Collections with MapReduce
Tamer Elsayed | Jimmy Lin | Douglas Oard
Proceedings of ACL-08: HLT, Short Papers

Proceedings of the ACL-08: HLT Demo Session
Jimmy Lin
Proceedings of the ACL-08: HLT Demo Session

Exploring Large-Data Issues in the Curriculum: A Case Study with MapReduce
Jimmy Lin
Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics

Fast, Easy, and Cheap: Construction of Statistical Machine Translation Models with MapReduce
Chris Dyer | Aaron Cordova | Alex Mont | Jimmy Lin
Proceedings of the Third Workshop on Statistical Machine Translation

Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce
Jimmy Lin
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

2007

Is Question Answering Better than Information Retrieval? Towards a Task-Based Evaluation Framework for Question Series
Jimmy Lin
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Concept Disambiguation for Improved Subject Access Using Multiple Knowledge Sources
Tandeep Sidhu | Judith Klavans | Jimmy Lin
Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007)

Answering Clinical Questions with Knowledge-Based and Statistical Techniques
Dina Demner-Fushman | Jimmy Lin
Computational Linguistics, Volume 33, Number 1, March 2007

Different Structures for Evaluating Answers to Complex Questions: Pyramids Won’t Topple, and Neither Will Human Assessors
Hoa Trang Dang | Jimmy Lin
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering
Dina Demner-Fushman | Jimmy Lin
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

Leveraging Reusability: Cost-Effective Lexical Acquisition for Large-Scale Ontology Translation
G. Craig Murray | Bonnie J. Dorr | Jimmy Lin | Jan Hajič | Pavel Pecina
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

The Role of Information Retrieval in Answering Complex Questions
Jimmy Lin
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

Situated Question Answering in the Clinical Domain: Selecting the Best Drug Treatment for Diseases
Dina Demner-Fushman | Jimmy Lin
Proceedings of the Workshop on Task-Focused Summarization and Question Answering

Generative Content Models for Structural Analysis of Medical Abstracts
Jimmy Lin | Damianos Karakos | Dina Demner-Fushman | Sanjeev Khudanpur
Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology

Leveraging Recurrent Phrase Structure in Large-scale Ontology Translation
G. Craig Murray | Bonnie J. Dorr | Jimmy Lin | Jan Hajič | Pavel Pecina
Proceedings of the 11th Annual Conference of the European Association for Machine Translation

Will Pyramids Built of Nuggets Topple Over?
Jimmy Lin | Dina Demner-Fushman
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

2005

Automatically Evaluating Answers to Definition Questions
Jimmy Lin | Dina Demner-Fushman
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

Evaluating Summaries and Answers: Two Sides of the Same Coin?
Jimmy Lin | Dina Demner-Fushman
Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization

2004

Answering Definition Questions with Multiple Knowledge Sources
Wesley Hildebrandt | Boris Katz | Jimmy Lin
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

A Computational Framework for Non-Lexicalist Semantics
Jimmy Lin
Proceedings of the Student Research Workshop at HLT-NAACL 2004

Fine-Grained Lexical Semantic Representations and Compositionally-Derived Events in Mandarin Chinese
Jimmy Lin
Proceedings of the Computational Lexical Semantics Workshop at HLT-NAACL 2004

2003

Extracting Structural Paraphrases from Aligned Monolingual Corpora
Ali Ibrahim | Boris Katz | Jimmy Lin
Proceedings of the Second International Workshop on Paraphrasing

2002

Annotating the Semantic Web Using Natural Language
Boris Katz | Jimmy Lin
COLING-02: The 2nd Workshop on NLP and XML (NLPXML-2002)

The Web as a Resource for Question Answering: Perspectives and Challenges
Jimmy Lin
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2001

Gathering Knowledge for a Question Answering System from Heterogeneous Information Sources
Boris Katz | Jimmy Lin | Sue Felshin
Proceedings of the ACL 2001 Workshop on Human Language Technology and Knowledge Management

2000

REXTOR: A System for Generating Relations from Natural Language
Boris Katz | Jimmy Lin
ACL-2000 Workshop on Recent Advances in Natural Language Processing and Information Retrieval