Michael Collins

Also published as: Michael John Collins


2020

Unsupervised Cross-Lingual Part-of-Speech Tagging for Truly Low-Resource Scenarios
Ramy Eskander | Smaranda Muresan | Michael Collins
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We describe a fully unsupervised cross-lingual transfer approach for part-of-speech (POS) tagging under a truly low-resource scenario. We assume access to parallel translations between the target language and one or more source languages for which POS taggers are available. We use the Bible as parallel data in our experiments: it is small, out-of-domain, and covers many diverse languages. Our approach innovates in three ways: 1) a robust approach to selecting training instances via cross-lingual annotation projection that exploits best practices of unsupervised type and token constraints, word-alignment confidence and density of projected POS, 2) a Bi-LSTM architecture that uses contextualized word embeddings, affix embeddings and hierarchical Brown clusters, and 3) an evaluation on 12 languages that are diverse in terms of language family and morphological typology. In spite of the use of limited and out-of-domain parallel data, our experiments demonstrate significant improvements in accuracy over previous work. In addition, we show that using multi-source information, either via projection or output combination, improves the performance for most target languages.

TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages
Jonathan H. Clark | Eunsol Choi | Michael Collins | Dan Garrette | Tom Kwiatkowski | Vitaly Nikolaev | Jennimaria Palomaki
Transactions of the Association for Computational Linguistics, Volume 8

Confidently making progress on multilingual modeling requires challenging, trustworthy evaluations. We present TyDi QA—a question answering dataset covering 11 typologically diverse languages with 204K question-answer pairs. The languages of TyDi QA are diverse with regard to their typology—the set of linguistic features each language expresses—such that we expect models performing well on this set to generalize across a large number of the world’s languages. We present a quantitative analysis of the data quality and example-level qualitative linguistic analyses of observed language phenomena that would not be found in English-only corpora. To provide a realistic information-seeking task and avoid priming effects, questions are written by people who want to know the answer, but don’t know the answer yet, and the data is collected directly in each language without the use of translation.

2019

Natural Questions: A Benchmark for Question Answering Research
Tom Kwiatkowski | Jennimaria Palomaki | Olivia Redfield | Michael Collins | Ankur Parikh | Chris Alberti | Danielle Epstein | Illia Polosukhin | Jacob Devlin | Kenton Lee | Kristina Toutanova | Llion Jones | Matthew Kelcey | Ming-Wei Chang | Andrew M. Dai | Jakob Uszkoreit | Quoc Le | Slav Petrov
Transactions of the Association for Computational Linguistics, Volume 7

We present the Natural Questions corpus, a question answering data set. Questions consist of real, anonymized, aggregated queries issued to the Google search engine. An annotator is presented with a question along with a Wikipedia page from the top 5 search results, and annotates a long answer (typically a paragraph) and a short answer (one or more entities) if present on the page, or marks null if no long/short answer is present. The public release consists of 307,373 training examples with single annotations; 7,830 examples with 5-way annotations for development data; and a further 7,842 examples with 5-way annotations, sequestered as test data. We present experiments validating the quality of the data. We also describe an analysis of 25-way annotations on 302 examples, giving insights into human variability on the annotation task. We introduce robust metrics for the purposes of evaluating question answering systems; demonstrate high human upper bounds on these metrics; and establish baseline results using competitive methods drawn from the related literature.

Fusion of Detected Objects in Text for Visual Question Answering
Chris Alberti | Jeffrey Ling | Michael Collins | David Reitter
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

To advance models of multimodal context, we introduce a simple yet powerful neural architecture for data that combines vision and natural language. The “Bounding Boxes in Text Transformer” (B2T2) also leverages referential information binding words to portions of the image in a single unified architecture. B2T2 is highly effective on the Visual Commonsense Reasoning benchmark, achieving a new state-of-the-art with a 25% relative reduction in error rate compared to published baselines and obtaining the best performance to date on the public leaderboard (as of May 22, 2019). A detailed ablation analysis shows that the early integration of the visual features into the text analysis is key to the effectiveness of the new architecture. A reference implementation of our models is provided.

Synthetic QA Corpora Generation with Roundtrip Consistency
Chris Alberti | Daniel Andor | Emily Pitler | Jacob Devlin | Michael Collins
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We introduce a novel method of generating synthetic question answering corpora by combining models of question generation and answer extraction, and by filtering the results to ensure roundtrip consistency. By pretraining on the resulting corpora, we obtain significant improvements on SQuAD2 and NQ, establishing a new state-of-the-art on the latter. Our synthetic data generation models, for both question generation and answer extraction, can be fully reproduced by finetuning a publicly available BERT model on the extractive subsets of SQuAD2 and NQ. We also describe a more powerful variant that does full sequence-to-sequence pretraining for question generation, obtaining exact match and F1 scores within 0.1% and 0.4% of human performance on SQuAD2.

BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark | Kenton Lee | Ming-Wei Chang | Tom Kwiatkowski | Michael Collins | Kristina Toutanova
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

In this paper we study yes/no questions that are naturally occurring, meaning that they are generated in unprompted and unconstrained settings. We build a reading comprehension dataset, BoolQ, of such questions, and show that they are unexpectedly challenging. They often query for complex, non-factoid information and require difficult entailment-like inference to solve. We also explore the effectiveness of a range of transfer learning baselines. We find that transferring from entailment data is more effective than transferring from paraphrase or extractive QA data, and that it, surprisingly, continues to be very beneficial even when starting from massive pre-trained language models such as BERT. Our best method trains BERT on MultiNLI and then re-trains it on our training set. It achieves 80.4% accuracy, compared to 90% accuracy for human annotators (and a 62% majority baseline), leaving a significant gap for future work.

Low-Resource Syntactic Transfer with Unsupervised Source Reordering
Mohammad Sadegh Rasooli | Michael Collins
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We describe a cross-lingual transfer method for dependency parsing that takes into account the problem of word-order differences between source and target languages. Our model relies only on the Bible, a considerably smaller parallel corpus than those commonly used in transfer methods. We use the concatenation of projected trees from the Bible corpus and gold-standard treebanks in multiple source languages, along with cross-lingual word representations. We demonstrate that reordering the source treebanks before training on them for a target language improves accuracy for languages outside the European language family. Our experiments on 68 treebanks (38 languages) in the Universal Dependencies corpus achieve high accuracy for all languages. Among them, our experiments on 16 treebanks of 12 non-European languages achieve an average absolute UAS improvement of 3.3% over a state-of-the-art method.

2018

Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency
Zhuang Ma | Michael Collins
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Noise Contrastive Estimation (NCE) is a powerful parameter estimation method for log-linear models, which avoids calculation of the partition function or its derivatives at each training step, a computationally demanding step in many cases. It is closely related to negative sampling methods, now widely used in NLP. This paper considers NCE-based estimation of conditional models. Conditional models are frequently encountered in practice; however, there has not been a rigorous theoretical analysis of NCE in this setting, and we argue that there are subtle but important questions when generalizing NCE to the conditional case. In particular, we analyze two variants of NCE for conditional models: one based on a classification objective, the other based on a ranking objective. We show that the ranking-based variant of NCE gives consistent parameter estimates under weaker assumptions than the classification-based method; we analyze the statistical efficiency of the ranking-based and classification-based variants of NCE; and finally we describe experiments on synthetic data and language modeling showing the effectiveness and tradeoffs of both methods.

2017

A Polynomial-Time Dynamic Programming Algorithm for Phrase-Based Decoding with a Fixed Distortion Limit
Yin-Wen Chang | Michael Collins
Transactions of the Association for Computational Linguistics, Volume 5

Decoding of phrase-based translation models in the general case is known to be NP-complete, by a reduction from the traveling salesman problem (Knight, 1999). In practice, phrase-based systems often impose a hard distortion limit that limits the movement of phrases during translation. However, the impact on complexity after imposing such a constraint is not well studied. In this paper, we describe a dynamic programming algorithm for phrase-based decoding with a fixed distortion limit. The runtime of the algorithm is O(n d! l h^(d+1)), where n is the sentence length, d is the distortion limit, l is a bound on the number of phrases starting at any position in the sentence, and h is related to the maximum number of target-language translations for any source word. The algorithm makes use of a novel representation that gives a new perspective on decoding of phrase-based models.

Cross-Lingual Syntactic Transfer with Limited Resources
Mohammad Sadegh Rasooli | Michael Collins
Transactions of the Association for Computational Linguistics, Volume 5

We describe a simple but effective method for cross-lingual syntactic transfer of dependency parsers, in the scenario where a large amount of translation data is not available. This method makes use of three steps: 1) a method for deriving cross-lingual word clusters, which can then be used in a multilingual parser; 2) a method for transferring lexical information from a target language to source language treebanks; 3) a method for integrating these steps with the density-driven annotation projection method of Rasooli and Collins (2015). Experiments show improvements over the state-of-the-art in several languages used in previous work, in a setting where the only source of translation data is the Bible, a considerably smaller corpus than the Europarl corpus used in previous work. Results using the Europarl corpus as a source of translation data show additional improvements over the results of Rasooli and Collins (2015). We conclude with results on 38 datasets from the Universal Dependencies corpora.

Source-Side Left-to-Right or Target-Side Left-to-Right? An Empirical Comparison of Two Phrase-Based Decoding Algorithms
Yin-Wen Chang | Michael Collins
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

This paper describes an empirical study of the phrase-based decoding algorithm proposed by Chang and Collins (2017). The algorithm produces a translation by processing the source-language sentence in strictly left-to-right order, differing from commonly used approaches that build the target-language sentence in left-to-right order. Our results show that the new algorithm is competitive with Moses (Koehn et al., 2007) in terms of both speed and BLEU scores.

2016

Towards a Convex HMM Surrogate for Word Alignment
Andrei Simion | Michael Collins | Cliff Stein
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Globally Normalized Transition-Based Neural Networks
Daniel Andor | Chris Alberti | David Weiss | Aliaksei Severyn | Alessandro Presta | Kuzman Ganchev | Slav Petrov | Michael Collins
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Transforming Dependency Structures to Logical Forms for Semantic Parsing
Siva Reddy | Oscar Täckström | Michael Collins | Tom Kwiatkowski | Dipanjan Das | Mark Steedman | Mirella Lapata
Transactions of the Association for Computational Linguistics, Volume 4

The strongly typed syntax of grammar formalisms such as CCG, TAG, LFG and HPSG offers a synchronous framework for deriving syntactic structures and semantic logical forms. In contrast, partly due to the lack of a strong type system, dependency structures are easy to annotate and have become a widely used form of syntactic analysis for many languages. However, the lack of a type system makes a formal mechanism for deriving logical forms from dependency structures challenging. We address this by introducing a robust system based on the lambda calculus for deriving neo-Davidsonian logical forms from dependency trees. These logical forms are then used for semantic parsing of natural language to Freebase. Experiments on the Free917 and WebQuestions datasets show that our representation is superior to the original dependency trees and that it outperforms a CCG-based representation on this task. Compared to prior work, we obtain the strongest result to date on Free917 and competitive results on WebQuestions.

Unsupervised Part-Of-Speech Tagging with Anchor Hidden Markov Models
Karl Stratos | Michael Collins | Daniel Hsu
Transactions of the Association for Computational Linguistics, Volume 4

We tackle unsupervised part-of-speech (POS) tagging by learning hidden Markov models (HMMs) that are particularly well-suited for the problem. These HMMs, which we call anchor HMMs, assume that each tag is associated with at least one word that can have no other tag, which is a relatively benign condition for POS tagging (e.g., “the” is a word that appears only under the determiner tag). We exploit this assumption and extend the non-negative matrix factorization framework of Arora et al. (2013) to design a consistent estimator for anchor HMMs. In experiments, our algorithm is competitive with strong baselines such as the clustering method of Brown et al. (1992) and the log-linear model of Berg-Kirkpatrick et al. (2010). Furthermore, it produces an interpretable model in which hidden states are automatically lexicalized by words.

2015

On A Strictly Convex IBM Model 1
Andrei Simion | Michael Collins | Cliff Stein
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Density-Driven Cross-Lingual Transfer of Dependency Parsers
Mohammad Sadegh Rasooli | Michael Collins
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Simple Semi-Supervised POS Tagging
Karl Stratos | Michael Collins
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing

Structured Training for Neural Network Transition-Based Parsing
David Weiss | Chris Alberti | Michael Collins | Slav Petrov
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Model-based Word Embeddings from Decompositions of Count Matrices
Karl Stratos | Michael Collins | Daniel Hsu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

A Provably Correct Learning Algorithm for Latent-Variable PCFGs
Shay B. Cohen | Michael Collins
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

A Constrained Viterbi Relaxation for Bidirectional Word Alignment
Yin-Wen Chang | Alexander M. Rush | John DeNero | Michael Collins
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Learning Dictionaries for Named Entity Recognition using Minimal Supervision
Arvind Neelakantan | Michael Collins
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Some Experiments with a Convex IBM Model 2
Andrei Simion | Michael Collins | Cliff Stein
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

2013

Optimal Beam Search for Machine Translation
Alexander Rush | Yin-Wen Chang | Michael Collins
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

A Convex Alternative to IBM Model 2
Andrei Simion | Michael Collins | Cliff Stein
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Spectral Learning of Refinement HMMs
Karl Stratos | Alexander Rush | Shay B. Cohen | Michael Collins
Proceedings of the Seventeenth Conference on Computational Natural Language Learning

Experiments with Spectral Learning of Latent-Variable PCFGs
Shay B. Cohen | Karl Stratos | Michael Collins | Dean P. Foster | Lyle Ungar
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Approximate PCFG Parsing Using Tensor Decomposition
Shay B. Cohen | Giorgio Satta | Michael Collins
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Spectral Learning Algorithms for Natural Language Processing
Shay Cohen | Michael Collins | Dean Foster | Karl Stratos | Lyle Ungar
NAACL HLT 2013 Tutorial Abstracts

2012

Spectral Learning of Latent-Variable PCFGs
Shay B. Cohen | Karl Stratos | Michael Collins | Dean P. Foster | Lyle Ungar
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Spectral Dependency Parsing with Latent Variables
Paramveer Dhillon | Jordan Rodu | Michael Collins | Dean Foster | Lyle Ungar
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
Alexander Rush | Roi Reichart | Michael Collins | Amir Globerson
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation
Yin-Wen Chang | Michael Collins
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Lagrangian Relaxation for Inference in Natural Language Processing
Michael Collins
Proceedings of the 12th International Conference on Parsing Technologies

Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation
Alexander M. Rush | Michael Collins
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Dual Decomposition for Natural Language Processing
Michael Collins | Alexander M. Rush
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

2010

Efficient Third-Order Dependency Parsers
Terry Koo | Michael Collins
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing
Alexander M. Rush | David Sontag | Michael Collins | Tommi Jaakkola
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Dual Decomposition for Parsing with Non-Projective Head Automata
Terry Koo | Alexander M. Rush | Michael Collins | Tommi Jaakkola | David Sontag
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

Non-Projective Parsing for Statistical Machine Translation
Xavier Carreras | Michael Collins
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing
Jun Suzuki | Hideki Isozaki | Xavier Carreras | Michael Collins
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Learning Context-Dependent Mappings from Sentences to Logical Form
Luke Zettlemoyer | Michael Collins
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Mari Ostendorf | Michael Collins | Shri Narayanan | Douglas W. Oard | Lucy Vanderwende
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Mari Ostendorf | Michael Collins | Shri Narayanan | Douglas W. Oard | Lucy Vanderwende
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

2008

Simple Semi-supervised Dependency Parsing
Terry Koo | Xavier Carreras | Michael Collins
Proceedings of ACL-08: HLT

TAG, Dynamic Programming, and the Perceptron for Efficient, Feature-Rich Parsing
Xavier Carreras | Michael Collins | Terry Koo
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning

2007

Structured Prediction Models via the Matrix-Tree Theorem
Terry Koo | Amir Globerson | Xavier Carreras | Michael Collins
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Online Learning of Relaxed CCG Grammars for Parsing to Logical Form
Luke Zettlemoyer | Michael Collins
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Chinese Syntactic Reordering for Statistical Machine Translation
Chao Wang | Michael Collins | Philipp Koehn
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

A Discriminative Model for Tree-to-Tree Translation
Brooke Cowan | Ivona Kučerová | Michael Collins
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

2005

Discriminative Reranking for Natural Language Parsing
Michael Collins | Terry Koo
Computational Linguistics, Volume 31, Number 1, March 2005

Hidden-Variable Models for Discriminative Reranking
Terry Koo | Michael Collins
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

Morphology and Reranking for the Statistical Parsing of Spanish
Brooke Cowan | Michael Collins
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

Discriminative Syntactic Language Modeling for Speech Recognition
Michael Collins | Brian Roark | Murat Saraclar
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

Clause Restructuring for Statistical Machine Translation
Michael Collins | Philipp Koehn | Ivona Kučerová
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm
Brian Roark | Murat Saraclar | Michael Collins | Mark Johnson
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

Incremental Parsing with the Perceptron Algorithm
Michael Collins | Brian Roark
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

2003

Head-Driven Statistical Models for Natural Language Parsing
Michael Collins
Computational Linguistics, Volume 29, Number 4, December 2003

2002

Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms
Michael Collins
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

Reranking an n-gram supertagger
John Chen | Srinivas Bangalore | Michael Collins | Owen Rambow
Proceedings of the Sixth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+6)

New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron
Michael Collins | Nigel Duffy
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

Ranking Algorithms for Named Entity Extraction: Boosting and the Voted Perceptron
Michael Collins
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

2001

Parameter Estimation for Statistical Parsing Models: Theory and Practice of Distribution-Free Methods
Michael Collins
Proceedings of the Seventh International Workshop on Parsing Technologies

2000

Answer Extraction
Steven Abney | Michael Collins | Amit Singhal
Sixth Applied Natural Language Processing Conference

1999

A Statistical Parser for Czech
Michael Collins | Jan Hajic | Lance Ramshaw | Christoph Tillmann
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics

Unsupervised Models for Named Entity Classification
Michael Collins | Yoram Singer
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

Book Reviews: Beyond Grammar: An Experience-based Theory of Language
Michael Collins
Computational Linguistics, Volume 25, Number 3, September 1999

1998

Semantic Tagging using a Probabilistic Context Free Grammar
Michael Collins | Scott Miller
Sixth Workshop on Very Large Corpora

1997

Three Generative, Lexicalised Models for Statistical Parsing
Michael Collins
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics

1996

A New Statistical Parser Based on Bigram Lexical Dependencies
Michael John Collins
34th Annual Meeting of the Association for Computational Linguistics

1995

Prepositional Phrase Attachment through a Backed-off Model
Michael Collins | James Brooks
Third Workshop on Very Large Corpora