Lei Yu


2020

Inferring symmetry in natural language
Chelsea Tanchip | Lei Yu | Aotao Xu | Yang Xu
Findings of the Association for Computational Linguistics: EMNLP 2020

We present a methodological framework for inferring symmetry of verb predicates in natural language. Empirical work on predicate symmetry has taken two main approaches. The feature-based approach focuses on linguistic features pertaining to symmetry. The context-based approach denies the existence of absolute symmetry but instead argues that such inference is context dependent. We develop methods that formalize these approaches and evaluate them against a novel symmetry inference sentence (SIS) dataset comprised of 400 naturalistic usages of literature-informed verbs spanning the spectrum of symmetry-asymmetry. Our results show that a hybrid transfer learning model that integrates linguistic features with contextualized language models most faithfully predicts the empirical data. Our work integrates existing approaches to symmetry in natural language and suggests how symmetry inference can improve systematicity in state-of-the-art language models.

Better Document-Level Machine Translation with Bayes’ Rule
Lei Yu | Laurent Sartran | Wojciech Stokowiec | Wang Ling | Lingpeng Kong | Phil Blunsom | Chris Dyer
Transactions of the Association for Computational Linguistics, Volume 8

We show that Bayes’ rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit because parallel documents are not always available. In our formulation, the posterior probability of a candidate translation is the product of the unconditional (prior) probability of the candidate output document and the “reverse translation probability” of translating the candidate output back into the source language. Our proposed model uses a powerful autoregressive language model as the prior on target language documents, but it assumes that each sentence is translated independently from the target to the source language. Crucially, at test time, when a source document is observed, the document language model prior induces dependencies between the translations of the source sentences in the posterior. The model’s independence assumption not only enables efficient use of available data, but it additionally admits a practical left-to-right beam-search algorithm for carrying out inference. Experiments show that our model benefits from using cross-sentence context in the language model, and it outperforms existing document translation approaches.
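
The scoring rule described in this abstract can be illustrated with a minimal sketch. It assumes log-probabilities are already available from some language model and some sentence-level reverse translation model; the function names, the dictionary keys, and the toy numbers are hypothetical and are not taken from the paper. In log space, the score of a candidate document is the prior log-probability plus the sum of per-sentence reverse translation log-probabilities, which equals the log posterior up to a constant that does not depend on the candidate.

```python
def noisy_channel_score(prior_logprob, reverse_logprobs):
    """Unnormalized log posterior of one candidate document translation.

    prior_logprob: log p(y), the document language model score of the
        candidate target document (hypothetical input).
    reverse_logprobs: list of log p(x_i | y_i), one per sentence, under the
        assumption that sentences are translated back independently.
    """
    return prior_logprob + sum(reverse_logprobs)


def pick_best(candidates):
    """Return the candidate with the highest unnormalized log posterior."""
    return max(
        candidates,
        key=lambda c: noisy_channel_score(c["prior_logprob"], c["reverse_logprobs"]),
    )


# Toy, made-up numbers purely for illustration (not experimental results).
candidates = [
    {"name": "A", "prior_logprob": -2.0, "reverse_logprobs": [-1.0, -0.5]},
    {"name": "B", "prior_logprob": -1.0, "reverse_logprobs": [-3.0, -2.0]},
]
best = pick_best(candidates)
```

This sketch only reranks a fixed list of complete candidates; the beam search mentioned in the abstract, which builds the document left to right, is not shown.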

2019

Unsupervised Recurrent Neural Network Grammars
Yoon Kim | Alexander Rush | Lei Yu | Adhiguna Kuncoro | Chris Dyer | Gábor Melis
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Recurrent neural network grammars (RNNG) are generative models of language which jointly model syntax and surface structure by incrementally generating a syntax tree and sentence in a top-down, left-to-right order. Supervised RNNGs achieve strong language modeling and parsing performance, but require an annotated corpus of parse trees. In this work, we experiment with unsupervised learning of RNNGs. Since directly marginalizing over the space of latent trees is intractable, we instead apply amortized variational inference. To maximize the evidence lower bound, we develop an inference network parameterized as a neural CRF constituency parser. On language modeling, unsupervised RNNGs perform as well as their supervised counterparts on benchmarks in English and Chinese. On constituency grammar induction, they are competitive with recent neural language models that induce tree structures from words through attention mechanisms.
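
For reference, the evidence lower bound mentioned in this abstract has the standard form below. The symbols are assumptions chosen here for illustration, not notation taken from the paper: x is the sentence, z the latent tree, p_θ the generative model, and q_φ the inference network.

```latex
\log p_\theta(x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x, z) - \log q_\phi(z \mid x) \right]
```

The bound is tight when q_φ(z | x) equals the true posterior p_θ(z | x), which is why a more expressive inference network can help.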

2016

Online Segment to Segment Neural Transduction
Lei Yu | Jan Buys | Phil Blunsom
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2006

A Chinese Automatic Text Summarization system for mobile devices
Lei Yu | Mengge Liu | Fuji Ren | Shingo Kuroiwa
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation