Daniel Cer


2020

Multilingual Universal Sentence Encoder for Semantic Retrieval
Yinfei Yang | Daniel Cer | Amin Ahmad | Mandy Guo | Jax Law | Noah Constant | Gustavo Hernandez Abrego | Steve Yuan | Chris Tar | Yun-hsuan Sung | Brian Strope | Ray Kurzweil
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present easy-to-use retrieval-focused multilingual sentence embedding models, made available on TensorFlow Hub. The models embed text from 16 languages into a shared semantic space using a multi-task trained dual-encoder that learns tied cross-lingual representations via translation bridge tasks (Chidambaram et al., 2018). The models achieve a new state-of-the-art in performance on monolingual and cross-lingual semantic retrieval (SR). Competitive performance is obtained on the related tasks of translation pair bitext retrieval (BR) and retrieval question answering (ReQA). On transfer learning tasks, our multilingual embeddings approach, and in some cases exceed, the performance of English-only sentence embeddings.

2019

ReQA: An Evaluation for End-to-End Answer Retrieval Models
Amin Ahmad | Noah Constant | Yinfei Yang | Daniel Cer
Proceedings of the 2nd Workshop on Machine Reading for Question Answering

Popular QA benchmarks like SQuAD have driven progress on the task of identifying answer spans within a specific passage, with models now surpassing human performance. However, retrieving relevant answers from a huge corpus of documents is still a challenging problem, and places different requirements on the model architecture. There is growing interest in developing scalable answer retrieval models trained end-to-end, bypassing the typical document retrieval step. In this paper, we introduce Retrieval Question-Answering (ReQA), a benchmark for evaluating large-scale sentence-level answer retrieval models. We establish baselines using both neural encoding models and classical information retrieval techniques. We release our evaluation code to encourage further work on this challenging task.

Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model
Muthu Chidambaram | Yinfei Yang | Daniel Cer | Steve Yuan | Yunhsuan Sung | Brian Strope | Ray Kurzweil
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

The scarcity of labeled training data across many languages is a significant roadblock for multilingual neural language processing. We approach the lack of in-language training data using sentence embeddings that map text written in different languages, but with similar meanings, to nearby embedding space representations. The representations are produced using a dual-encoder-based model trained to maximize the representational similarity between sentence pairs drawn from parallel data. The representations are enhanced using multitask training and unsupervised monolingual corpora. The effectiveness of our multilingual sentence embeddings is assessed on a comprehensive collection of monolingual, cross-lingual, and zero-shot/few-shot learning tasks.

Hierarchical Document Encoder for Parallel Corpus Mining
Mandy Guo | Yinfei Yang | Keith Stevens | Daniel Cer | Heming Ge | Yun-hsuan Sung | Brian Strope | Ray Kurzweil
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)

We explore using multilingual document embeddings for nearest neighbor mining of parallel data. Three document-level representations are investigated: (i) document embeddings generated by simply averaging multilingual sentence embeddings; (ii) a neural bag-of-words (BoW) document encoding model; (iii) a hierarchical multilingual document encoder (HiDE) that builds on our sentence-level model. The results show document embeddings derived from sentence-level averaging are surprisingly effective for clean datasets, but suggest models trained hierarchically at the document level are more effective on noisy data. Analysis experiments demonstrate our hierarchical models are very robust to variations in the underlying sentence embedding quality. Using document embeddings trained with HiDE achieves the state-of-the-art on United Nations (UN) parallel document mining, 94.9% P@1 for en-fr and 97.3% P@1 for en-es.

2018

Universal Sentence Encoder for English
Daniel Cer | Yinfei Yang | Sheng-yi Kong | Nan Hua | Nicole Limtiaco | Rhomni St. John | Noah Constant | Mario Guajardo-Cespedes | Steve Yuan | Chris Tar | Brian Strope | Ray Kurzweil
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present easy-to-use TensorFlow Hub sentence embedding models having good task transfer performance. Model variants allow for trade-offs between accuracy and compute resources. We report the relationship between model complexity, resources, and transfer performance. Comparisons are made with baselines without transfer learning and with baselines that incorporate word-level transfer. Transfer learning using sentence-level embeddings is shown to outperform models without transfer learning and often those that use only word-level transfer. We show good transfer task performance with minimal training data and obtain encouraging results on word embedding association tests (WEAT) of model bias.

Learning Semantic Textual Similarity from Conversations
Yinfei Yang | Steve Yuan | Daniel Cer | Sheng-yi Kong | Noah Constant | Petr Pilar | Heming Ge | Yun-Hsuan Sung | Brian Strope | Ray Kurzweil
Proceedings of The Third Workshop on Representation Learning for NLP

We present a novel approach to learning representations for sentence-level semantic similarity using conversational data. Our method trains an unsupervised model to predict conversational responses. The resulting sentence embeddings perform well on the Semantic Textual Similarity (STS) Benchmark and SemEval 2017’s Community Question Answering (CQA) question similarity subtask. Performance is further improved by introducing multitask training, combining conversational response prediction and natural language inference. Extensive experiments show the proposed model achieves the best performance among all neural models on the STS Benchmark and is competitive with the state-of-the-art feature-engineered and mixed systems for both tasks.

Effective Parallel Corpus Mining using Bilingual Sentence Embeddings
Mandy Guo | Qinlan Shen | Yinfei Yang | Heming Ge | Daniel Cer | Gustavo Hernandez Abrego | Keith Stevens | Noah Constant | Yun-Hsuan Sung | Brian Strope | Ray Kurzweil
Proceedings of the Third Conference on Machine Translation: Research Papers

This paper presents an effective approach for parallel corpus mining using bilingual sentence embeddings. Our embedding models are trained to produce similar representations exclusively for bilingual sentence pairs that are translations of each other. This is achieved using a novel training method that introduces hard negatives consisting of sentences that are not translations but have some degree of semantic similarity. The quality of the resulting embeddings is evaluated on parallel corpus reconstruction and by assessing machine translation systems trained on gold vs. mined sentence pairs. We find that the sentence embeddings can be used to reconstruct the United Nations Parallel Corpus (Ziemski et al., 2016) at the sentence level with a precision of 48.9% for en-fr and 54.9% for en-es. When adapted to document-level matching, we achieve a parallel document matching accuracy that is comparable to the significantly more computationally intensive approach of Uszkoreit et al. (2010). Using reconstructed parallel data, we are able to train NMT models that perform nearly as well as models trained on the original data (within 1-2 BLEU).

2017

Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
Steven Bethard | Marine Carpuat | Marianna Apidianaki | Saif M. Mohammad | Daniel Cer | David Jurgens
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation
Daniel Cer | Mona Diab | Eneko Agirre | Iñigo Lopez-Gazpio | Lucia Specia
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Semantic Textual Similarity (STS) measures the meaning similarity of sentences. Applications include machine translation (MT), summarization, generation, question answering (QA), short answer grading, semantic search, dialog and conversational systems. The STS shared task is a venue for assessing the current state-of-the-art. The 2017 task focuses on multilingual and cross-lingual pairs with one sub-track exploring MT quality estimation (MTQE) data. The task obtained strong participation from 31 teams, with 17 participating in all language tracks. We summarize performance and review a selection of well-performing methods. Analysis highlights common errors, providing insight into the limitations of existing models. To support ongoing work on semantic representations, the STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017).

2016

Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)
Steven Bethard | Marine Carpuat | Daniel Cer | David Jurgens | Preslav Nakov | Torsten Zesch
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation
Eneko Agirre | Carmen Banea | Daniel Cer | Mona Diab | Aitor Gonzalez-Agirre | Rada Mihalcea | German Rigau | Janyce Wiebe
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)
Preslav Nakov | Torsten Zesch | Daniel Cer | David Jurgens
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

SemEval-2015 Task 2: Semantic Textual Similarity, English, Spanish and Pilot on Interpretability
Eneko Agirre | Carmen Banea | Claire Cardie | Daniel Cer | Mona Diab | Aitor Gonzalez-Agirre | Weiwei Guo | Iñigo Lopez-Gazpio | Montse Maritxalar | Rada Mihalcea | German Rigau | Larraitz Uria | Janyce Wiebe
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

SemEval-2014 Task 10: Multilingual Semantic Textual Similarity
Eneko Agirre | Carmen Banea | Claire Cardie | Daniel Cer | Mona Diab | Aitor Gonzalez-Agirre | Weiwei Guo | Rada Mihalcea | German Rigau | Janyce Wiebe
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Phrasal: A Toolkit for New Directions in Statistical Machine Translation
Spence Green | Daniel Cer | Christopher Manning
Proceedings of the Ninth Workshop on Statistical Machine Translation

An Empirical Comparison of Features and Tuning for Phrase-based Machine Translation
Spence Green | Daniel Cer | Christopher Manning
Proceedings of the Ninth Workshop on Statistical Machine Translation

2013

Bilingual Word Embeddings for Phrase-Based Machine Translation
Will Y. Zou | Richard Socher | Daniel Cer | Christopher D. Manning
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Feature-Rich Phrase-based Translation: Stanford University’s Submission to the WMT 2013 Translation Task
Spence Green | Daniel Cer | Kevin Reschke | Rob Voigt | John Bauer | Sida Wang | Natalia Silveira | Julia Neidert | Christopher D. Manning
Proceedings of the Eighth Workshop on Statistical Machine Translation

Positive Diversity Tuning for Machine Translation System Combination
Daniel Cer | Christopher D. Manning | Dan Jurafsky
Proceedings of the Eighth Workshop on Statistical Machine Translation

Fast and Adaptive Online Training of Feature-Rich Translation Models
Spence Green | Sida Wang | Daniel Cer | Christopher D. Manning
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

*SEM 2013 shared task: Semantic Textual Similarity
Eneko Agirre | Daniel Cer | Mona Diab | Aitor Gonzalez-Agirre | Weiwei Guo
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity

2012

SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity
Eneko Agirre | Daniel Cer | Mona Diab | Aitor Gonzalez-Agirre
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

Stanford: Probabilistic Edit Distance Metrics for STS
Mengqiu Wang | Daniel Cer
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

2010

The Best Lexical Metric for Phrase-Based Statistical MT System Optimization
Daniel Cer | Christopher D. Manning | Daniel Jurafsky
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Phrasal: A Statistical Machine Translation Toolkit for Exploring New Model Features
Daniel Cer | Michel Galley | Daniel Jurafsky | Christopher D. Manning
Proceedings of the NAACL HLT 2010 Demonstration Session

Parsing to Stanford Dependencies: Trade-offs between Speed and Accuracy
Daniel Cer | Marie-Catherine de Marneffe | Dan Jurafsky | Chris Manning
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We investigate a number of approaches to generating Stanford Dependencies, a widely used semantically oriented dependency representation. We examine algorithms specifically designed for dependency parsing (Nivre, Nivre Eager, Covington, Eisner, and RelEx) as well as dependencies extracted from constituent parse trees created by phrase structure parsers (Charniak, Charniak-Johnson, Bikel, Berkeley and Stanford). We found that constituent parsers systematically outperform algorithms designed specifically for dependency parsing. The most accurate method for generating dependencies is the Charniak-Johnson reranking parser, with 89% (labeled) attachment F1 score. The fastest methods are Nivre, Nivre Eager, and Covington, used with a linear classifier to make local parsing decisions, which can parse the entire Penn Treebank development set (section 22) in less than 10 seconds on an Intel Xeon E5520. However, this speed comes with a substantial drop in F1 score (about 76% for labeled attachment) compared to competing methods. By tuning how much of the search space is explored by the Charniak-Johnson parser, we are able to arrive at a balanced configuration that is both fast and nearly as good as the most accurate approaches.

2008

Regularization and Search for Minimum Error Rate Training
Daniel Cer | Dan Jurafsky | Christopher D. Manning
Proceedings of the Third Workshop on Statistical Machine Translation

2007

Learning Alignments and Leveraging Natural Logic
Nathanael Chambers | Daniel Cer | Trond Grenager | David Hall | Chloe Kiddon | Bill MacCartney | Marie-Catherine de Marneffe | Daniel Ramage | Eric Yeh | Christopher D. Manning
Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing

2006

Learning to recognize features of valid textual entailments
Bill MacCartney | Trond Grenager | Marie-Catherine de Marneffe | Daniel Cer | Christopher D. Manning
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference