Lingpeng Kong


2020

Better Document-Level Machine Translation with Bayes’ Rule
Lei Yu | Laurent Sartran | Wojciech Stokowiec | Wang Ling | Lingpeng Kong | Phil Blunsom | Chris Dyer
Transactions of the Association for Computational Linguistics, Volume 8

We show that Bayes’ rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit because parallel documents are not always available. In our formulation, the posterior probability of a candidate translation is the product of the unconditional (prior) probability of the candidate output document and the “reverse translation probability” of translating the candidate output back into the source language. Our proposed model uses a powerful autoregressive language model as the prior on target language documents, but it assumes that each sentence is translated independently from the target to the source language. Crucially, at test time, when a source document is observed, the document language model prior induces dependencies between the translations of the source sentences in the posterior. The model’s independence assumption not only enables efficient use of available data, but also admits a practical left-to-right beam-search algorithm for carrying out inference. Experiments show that our model benefits from using cross-sentence context in the language model, and it outperforms existing document translation approaches.
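
The factorization described in this abstract can be sketched as follows. The notation is an assumption made for illustration and is not taken from the paper: x = (x_1, ..., x_n) is the source document, y = (y_1, ..., y_n) is a candidate translation, and sentences are assumed to align one to one.

```latex
% Illustrative notation only (assumed, not the paper's):
% x = (x_1, ..., x_n) is the source document,
% y = (y_1, ..., y_n) is a candidate translation, aligned sentence by sentence.
\begin{align*}
p(y \mid x) &\propto p(y)\, p(x \mid y) \\
            &= \underbrace{\prod_{i=1}^{n} p_{\mathrm{LM}}(y_i \mid y_{<i})}_{\text{document language model prior}}
               \times
               \underbrace{\prod_{i=1}^{n} p(x_i \mid y_i)}_{\text{sentence-level reverse translation}} \\
\hat{y} &= \arg\max_{y} \sum_{i=1}^{n}
           \bigl[ \log p_{\mathrm{LM}}(y_i \mid y_{<i}) + \log p(x_i \mid y_i) \bigr]
\end{align*}
```

Under these assumptions, the i-th term of the sum depends only on the sentences y_1, ..., y_i and on x_i, so a score can be accumulated sentence by sentence, which is consistent with the left-to-right beam search the abstract mentions. The prior term is what couples the sentences: each reverse translation factor sees a single sentence pair, while the language model conditions on all preceding target sentences.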

2017

What Do Recurrent Neural Network Grammars Learn About Syntax?
Adhiguna Kuncoro | Miguel Ballesteros | Lingpeng Kong | Chris Dyer | Graham Neubig | Noah A. Smith
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Recurrent neural network grammars (RNNG) are a recently proposed probabilistic generative modeling family for natural language. They show state-of-the-art language modeling and parsing performance. We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection. We find that explicit modeling of composition is crucial for achieving the best performance. Through the attention mechanism, we find that headedness plays a central role in phrasal representation (with the model’s latent attention largely agreeing with predictions made by hand-crafted head rules, albeit with some important differences). By training grammars without nonterminal labels, we find that phrasal representations depend minimally on nonterminals, providing support for the endocentricity hypothesis.

2016

Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser
Adhiguna Kuncoro | Miguel Ballesteros | Lingpeng Kong | Chris Dyer | Noah A. Smith
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2015

Bayesian Optimization of Text Representations
Dani Yogatama | Lingpeng Kong | Noah A. Smith
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

ACBiMA: Advanced Chinese Bi-Character Word Morphological Analyzer
Ting-Hao Huang | Yun-Nung Chen | Lingpeng Kong
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing

Transforming Dependencies into Phrase Structures
Lingpeng Kong | Alexander M. Rush | Noah A. Smith
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

A Dependency Parser for Tweets
Lingpeng Kong | Nathan Schneider | Swabha Swayamdipta | Archna Bhatia | Chris Dyer | Noah A. Smith
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Dependency Parsing for Weibo: An Efficient Probabilistic Logic Programming Approach
William Yang Wang | Lingpeng Kong | Kathryn Mazaitis | William W. Cohen
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)