Chris Quirk


2020

Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation
Liqun Shao | Sahitya Mantravadi | Tom Manzini | Alejandro Buendia | Manon Knoertzer | Soundar Srinivasan | Chris Quirk
Proceedings of the First Workshop on Natural Language Interfaces

In this paper, we detail novel strategies for interpolating personalized language models and methods to handle out-of-vocabulary (OOV) tokens to improve personalized language models. Using publicly available data from Reddit, we demonstrate improvements in offline metrics at the user level by interpolating a global LSTM-based authoring model with a user-personalized n-gram model. By optimizing this approach with a back-off to uniform OOV penalty and the interpolation coefficient, we observe that over 80% of users receive a lift in perplexity, with an average of 5.4% in perplexity lift per user. In doing this research we extend previous work in building NLIs and improve the robustness of metrics for downstream tasks.
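
The interpolation idea described in this abstract can be illustrated with a minimal sketch. It assumes plain linear interpolation of per-token probabilities and a toy unigram user model; the paper's actual LSTM and n-gram models, and its exact OOV handling, are not reproduced here. All function names, the `oov_penalty` parameter, and the toy numbers are hypothetical.

```python
import math


def interpolate(p_global, p_user, lam):
    """Linear interpolation: lam weights the user model, (1 - lam) the global model."""
    return lam * p_user + (1.0 - lam) * p_global


def user_prob(token, user_counts, vocab_size, oov_penalty):
    """Toy user model (assumption): relative frequency for seen tokens,
    otherwise a uniform back-off over the vocabulary scaled by oov_penalty.
    This toy distribution is not renormalized."""
    if token in user_counts:
        return user_counts[token] / sum(user_counts.values())
    return oov_penalty / vocab_size


def perplexity(probs):
    """Perplexity of a token sequence given its per-token probabilities."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def perplexity_lift(ppl_baseline, ppl_new):
    """Relative reduction in perplexity; positive means the new model is better."""
    return (ppl_baseline - ppl_new) / ppl_baseline


# Made-up numbers for illustration only.
tokens = ["the", "cat", "sat"]
global_probs = [0.1, 0.01, 0.02]       # stand-in for global model outputs
user_counts = {"the": 5, "cat": 5}     # "sat" is OOV for this user
mixed_probs = [
    interpolate(pg, user_prob(t, user_counts, vocab_size=10000, oov_penalty=1.0), lam=0.3)
    for t, pg in zip(tokens, global_probs)
]
ppl_global = perplexity(global_probs)
ppl_mixed = perplexity(mixed_probs)
lift = perplexity_lift(ppl_global, ppl_mixed)
```

In this sketch both `lam` and `oov_penalty` are free parameters; the abstract reports tuning the interpolation coefficient and the OOV penalty, which would correspond to a search over such values per user or globally.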

2019

Multilingual Whispers: Generating Paraphrases with Translation
Christian Federmann | Oussama Elachqar | Chris Quirk
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

Naturally occurring paraphrase data, such as multiple news stories about the same event, is a useful but rare resource. This paper compares translation-based paraphrase gathering using human, automatic, or hybrid techniques to monolingual paraphrasing by experts and non-experts. We gather translations, paraphrases, and empirical human quality assessments of these approaches. Neural machine translation techniques, especially when pivoting through related languages, provide a relatively robust source of paraphrases with diversity comparable to expert human paraphrases. Surprisingly, human translators do not reliably outperform neural systems. The resulting data release will not only be a useful test set, but will also allow additional explorations in translation and paraphrase quality assessments and relationships.

Microsoft Icecaps: An Open-Source Toolkit for Conversation Modeling
Vighnesh Leonardo Shiv | Chris Quirk | Anshuman Suri | Xiang Gao | Khuram Shahid | Nithya Govindarajan | Yizhe Zhang | Jianfeng Gao | Michel Galley | Chris Brockett | Tulasi Menon | Bill Dolan
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

The Intelligent Conversation Engine: Code and Pre-trained Systems (Microsoft Icecaps) is an upcoming open-source natural language processing repository. Icecaps wraps TensorFlow functionality in a modular component-based architecture, presenting an intuitive and flexible paradigm for constructing sophisticated learning setups. Capabilities include multitask learning between models with shared parameters, upgraded language model decoding features, a range of built-in architectures, and a user-friendly data processing pipeline. The system is targeted toward conversational tasks, exploring diverse response generation, coherence, and knowledge grounding. Icecaps also provides pre-trained conversational models that can be either used directly or loaded for fine-tuning or bootstrapping other models; these models power an online demo of our framework.

Towards Content Transfer through Grounded Text Generation
Shrimai Prabhumoye | Chris Quirk | Michel Galley
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Recent work in neural generation has attracted significant interest in controlling the form of text, such as style, persona, and politeness. However, there has been less work on controlling neural text generation for content. This paper introduces the notion of Content Transfer for long-form text generation, where the task is to generate a next sentence in a document that both fits its context and is grounded in a content-rich external textual source such as a news story. Our experiments on Wikipedia data show significant improvements against competitive baselines. As another contribution of this paper, we release a benchmark dataset of 640k Wikipedia referenced sentences paired with the source articles to encourage exploration of this new task.

2018

Confidence Modeling for Neural Semantic Parsing
Li Dong | Chris Quirk | Mirella Lapata
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this work we focus on confidence modeling for neural semantic parsers that are built upon sequence-to-sequence models. We outline three major causes of uncertainty, and design various metrics to quantify these factors. These metrics are then used to estimate confidence scores that indicate whether model predictions are likely to be correct. Beyond confidence estimation, we identify which parts of the input contribute to uncertain predictions, allowing users to interpret their model and verify or refine its input. Experimental results show that our confidence model significantly outperforms a widely used method that relies on posterior probability, and improves the quality of interpretation compared to simply relying on attention scores.

Assigning people to tasks identified in email: The EPA dataset for addressee tagging for detected task intent
Revanth Rameshkumar | Peter Bailey | Abhishek Jha | Chris Quirk
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text

We describe the Enron People Assignment (EPA) dataset, in which tasks that are described in emails are associated with the person(s) responsible for carrying out these tasks. We identify tasks and the responsible people in the Enron email dataset. We define evaluation methods for this challenge and report scores for our model and naïve baselines. The resulting model enables a user experience operating within a commercial email service: given a person and a task, it determines if the person should be notified of the task.

2017

NLP for Precision Medicine
Hoifung Poon | Chris Quirk | Kristina Toutanova | Wen-tau Yih
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

We will introduce precision medicine and showcase the vast opportunities for NLP in this burgeoning field with great societal impact. We will review pressing NLP problems, state-of-the-art methods, and important applications, as well as datasets, medical resources, and practical issues. The tutorial will provide an accessible overview of biomedicine, and does not presume knowledge in biology or healthcare. The ultimate goal is to reduce the entry barrier for NLP researchers to contribute to this exciting domain.

Distant Supervision for Relation Extraction beyond the Sentence Boundary
Chris Quirk | Hoifung Poon
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

The growing demand for structured knowledge has led to great interest in relation extraction, especially in cases with limited supervision. However, existing distant supervision approaches only extract relations expressed in single sentences. In general, cross-sentence relation extraction is under-explored, even in the supervised-learning setting. In this paper, we propose the first approach for applying distant supervision to cross-sentence relation extraction. At the core of our approach is a graph representation that can incorporate both standard dependencies and discourse relations, thus providing a unifying way to model relations within and across sentences. We extract features from multiple paths in this graph, increasing accuracy and robustness when confronted with linguistic variation and analysis error. Experiments on an important extraction task for precision medicine show that our approach can learn an accurate cross-sentence extractor, using only a small existing knowledge base and unlabeled text from biomedical research articles. Compared to the existing distant supervision paradigm, our approach extracted twice as many relations at similar precision, thus demonstrating the prevalence of cross-sentence relations and the promise of our approach.

Cross-Sentence N-ary Relation Extraction with Graph LSTMs
Nanyun Peng | Hoifung Poon | Chris Quirk | Kristina Toutanova | Wen-tau Yih
Transactions of the Association for Computational Linguistics, Volume 5

Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary relations that span multiple sentences. In this paper, we explore a general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction. The graph formulation provides a unified way of exploring different LSTM approaches and incorporating various intra-sentential and inter-sentential dependencies, such as sequential, syntactic, and discourse relations. A robust contextual representation is learned for the entities, which serves as input to the relation classifier. This simplifies handling of relations with arbitrary arity, and enables multi-task learning with related relations. We evaluate this framework in two important precision medicine settings, demonstrating its effectiveness with both conventional supervised learning and distant supervision. Cross-sentence extraction produced larger knowledge bases, and multi-task learning significantly improved extraction accuracy. A thorough analysis of various LSTM approaches yielded useful insight into the impact of linguistic analysis on extraction accuracy.

2016

Improved Semantic Parsers For If-Then Statements
I. Beltagy | Chris Quirk
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Compositional Learning of Embeddings for Relation Paths in Knowledge Base and Text
Kristina Toutanova | Victoria Lin | Wen-tau Yih | Hoifung Poon | Chris Quirk
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

Pre-Computable Multi-Layer Neural Network Language Models
Jacob Devlin | Chris Quirk | Arul Menezes
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

A Discriminative Model for Semantics-to-String Translation
Aleš Tamchyna | Chris Quirk | Michel Galley
Proceedings of the 1st Workshop on Semantics-Driven Statistical Machine Translation (S2MT 2015)

An AMR parser for English, French, German, Spanish and Japanese and a new AMR-annotated corpus
Lucy Vanderwende | Arul Menezes | Chris Quirk
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

Language to Code: Learning Semantic Parsers for If-This-Then-That Recipes
Chris Quirk | Raymond Mooney | Michel Galley
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets
Michel Galley | Chris Brockett | Alessandro Sordoni | Yangfeng Ji | Michael Auli | Chris Quirk | Margaret Mitchell | Jianfeng Gao | Bill Dolan
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

Graph-based Semi-Supervised Learning of Translation Models from Monolingual Data
Avneesh Saluja | Hany Hassan | Kristina Toutanova | Chris Quirk
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

Joint Language and Translation Modeling with Recurrent Neural Networks
Michael Auli | Michel Galley | Chris Quirk | Geoffrey Zweig
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Monolingual Marginal Matching for Translation Model Adaptation
Ann Irvine | Chris Quirk | Hal Daumé III
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Regularized Minimum Error Rate Training
Michel Galley | Chris Quirk | Colin Cherry | Kristina Toutanova
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Controlled Ascent: Imbuing Statistical MT with Linguistic Knowledge
William Lewis | Chris Quirk
Proceedings of the Second Workshop on Hybrid Approaches to Translation

Lightly Supervised Learning of Procedural Dialog Systems
Svitlana Volkova | Pallavi Choudhury | Chris Quirk | Bill Dolan | Luke Zettlemoyer
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Exact Maximum Inference for the Fertility Hidden Markov Model
Chris Quirk
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Semantic Neighborhoods as Hypergraphs
Chris Quirk | Pallavi Choudhury
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Beyond Left-to-Right: Multiple Decomposition Structures for SMT
Hui Zhang | Kristina Toutanova | Chris Quirk | Jianfeng Gao
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Morphological, Syntactical and Semantic Knowledge in Statistical Machine Translation
Marta Ruiz Costa-jussà | Chris Quirk
NAACL HLT 2013 Tutorial Abstracts

2012

Book Review: Linguistic Structure Prediction by Noah A. Smith
Chris Quirk
Computational Linguistics, Volume 38, Issue 2 - June 2012

MSR SPLAT, a language analysis toolkit
Chris Quirk | Pallavi Choudhury | Jianfeng Gao | Hisami Suzuki | Kristina Toutanova | Michael Gamon | Wen-tau Yih | Colin Cherry | Lucy Vanderwende
Proceedings of the Demonstration Session at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

On Hierarchical Re-ordering and Permutation Parsing for Phrase-based Decoding
Colin Cherry | Robert C. Moore | Chris Quirk
Proceedings of the Seventh Workshop on Statistical Machine Translation

Leave-One-Out Phrase Model Training for Large-Scale Deployment
Joern Wuebker | Mei-Yuh Hwang | Chris Quirk
Proceedings of the Seventh Workshop on Statistical Machine Translation

2011

Optimal Search for Minimum Error Rate Training
Michel Galley | Chris Quirk
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

MSR-NLP Entry in BioNLP Shared Task 2011
Chris Quirk | Pallavi Choudhury | Michael Gamon | Lucy Vanderwende
Proceedings of BioNLP Shared Task 2011 Workshop

From pecher to pêcher... or pécher: Simplifying French Input by Accent Prediction
Pallavi Choudhury | Chris Quirk | Hisami Suzuki
Proceedings of the Workshop on Advances in Text Input Methods (WTIM 2011)

Gappy Phrasal Alignment By Agreement
Mohit Bansal | Chris Quirk | Robert Moore
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
Jason R. Smith | Chris Quirk | Kristina Toutanova
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

A Large Scale Ranker-Based System for Search Query Spelling Correction
Jianfeng Gao | Xiaolong Li | Daniel Micol | Chris Quirk | Xu Sun
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Learning Phrase-Based Spelling Error Models from Clickthrough Data
Xu Sun | Jianfeng Gao | Daniel Micol | Chris Quirk
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Top-Down K-Best A* Parsing
Adam Pauls | Dan Klein | Chris Quirk
Proceedings of the ACL 2010 Conference Short Papers

2009

Less is More: Significance-Based N-gram Selection for Smaller, Better Language Models
Robert C. Moore | Chris Quirk
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Improved Smoothing for N-gram Language Models Based on Ordinary Counts
Robert C. Moore | Chris Quirk
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

Bayesian Learning of Non-Compositional Phrases with Synchronous Parsing
Hao Zhang | Chris Quirk | Robert C. Moore | Daniel Gildea
Proceedings of ACL-08: HLT

Random Restarts in Minimum Error Rate Training for Statistical Machine Translation
Robert C. Moore | Chris Quirk
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Syntactic Models for Structural Word Insertion and Deletion during Translation
Arul Menezes | Chris Quirk
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

2007

Using Dependency Order Templates to Improve Generality in Translation
Arul Menezes | Chris Quirk
Proceedings of the Second Workshop on Statistical Machine Translation

An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation
Robert Moore | Chris Quirk
Proceedings of the Second Workshop on Statistical Machine Translation

2006

The impact of parse quality on syntactically-informed statistical machine translation
Chris Quirk | Simon Corston-Oliver
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

Microsoft Research Treelet Translation System: NAACL 2006 Europarl Evaluation
Arul Menezes | Kristina Toutanova | Chris Quirk
Proceedings on the Workshop on Statistical Machine Translation

Do we need phrases? Challenging the conventional wisdom in Statistical Machine Translation
Chris Quirk | Arul Menezes
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

2005

Dependency Treelet Translation: Syntactically Informed Phrasal SMT
Chris Quirk | Arul Menezes | Colin Cherry
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources
Bill Dolan | Chris Quirk | Chris Brockett
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Monolingual Machine Translation for Paraphrase Generation
Chris Quirk | Chris Brockett | William Dolan
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

2002

English-Japanese Example-Based Machine Translation Using Abstract Linguistic Representations
Chris Brockett | Takako Aikawa | Anthony Aue | Arul Menezes | Chris Quirk | Hisami Suzuki
COLING-02: Machine Translation in Asia