John K. Pate

Also published as: John Pate, John K Pate


Grammar induction from (lots of) words alone
John K Pate | Mark Johnson
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Grammar induction is the task of learning syntactic structure in a setting where that structure is hidden. Grammar induction from words alone is interesting because it is similar to the problem that a child learning a language faces. Previous work has typically assumed richer but cognitively implausible input, such as POS-tag annotated data, which makes that work less relevant to human language acquisition. We show that grammar induction from words alone is in fact feasible when the model is provided with sufficient training data, and present two new streaming or mini-batch algorithms for PCFG inference that can learn from millions of words of training data. We compare the performance of these algorithms to a batch algorithm that learns from less data. The mini-batch algorithms outperform the batch algorithm, showing that cheap inference with more data is better than intensive inference with less data. Additionally, we show that the harmonic initialiser, which previous work identified as essential when learning from small POS-tag annotated corpora (Klein and Manning, 2004), is not superior to a uniform initialisation.


A Computationally Efficient Algorithm for Learning Topical Collocation Models
Zhendong Zhao | Lan Du | Benjamin Börschinger | John K Pate | Massimiliano Ciaramita | Mark Steedman | Mark Johnson
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)


The Effect of Dependency Representation Scheme on Syntactic Language Modelling
Sunghwan Kim | John Pate | Mark Johnson
Proceedings of the Australasian Language Technology Association Workshop 2014

Syllable weight encodes mostly the same information for English word segmentation as dictionary stress
John K Pate | Mark Johnson
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)


Unsupervised Dependency Parsing with Acoustic Cues
John K Pate | Sharon Goldwater
Transactions of the Association for Computational Linguistics, Volume 1

Unsupervised parsing is a difficult task that infants readily perform. Progress has been made on this task using text-based models, but few computational approaches have considered how infants might benefit from acoustic cues. This paper explores the hypothesis that word duration can help with learning syntax. We describe how duration information can be incorporated into an unsupervised Bayesian dependency parser whose only other source of information is the words themselves (without punctuation or parts of speech). Our results, evaluated on both adult-directed and child-directed utterances, show that using word duration can improve parse quality relative to words-only baselines. These results support the idea that acoustic cues provide useful evidence about syntactic structure for language-learning infants, and motivate the use of word duration cues in NLP tasks with speech.


Unsupervised Syntactic Chunking with Acoustic Cues: Computational Models for Prosodic Bootstrapping
John Pate | Sharon Goldwater
Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics