Miloš Stanojević


2020

Max-Margin Incremental CCG Parsing
Miloš Stanojević | Mark Steedman
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Incremental syntactic parsing has been an active research area both for cognitive scientists trying to model human sentence processing and for NLP researchers attempting to combine incremental parsing with language modelling for ASR and MT. Most effort has been directed at designing the right transition mechanism, but less has been done to answer the question of what a probabilistic model for those transition parsers should look like. The highly incremental transition mechanism of a recently proposed CCG parser, when trained in a straightforward locally normalised discriminative fashion, produces very poor results on the English CCGbank. We identify three biases as the causes of this problem: label bias, exposure bias and imbalanced probabilities bias. While known techniques for tackling these biases improve results, they still do not make the parser state of the art. Instead, we tackle all three biases at the same time using an improved version of beam search optimisation that minimises all beam search violations instead of only the biggest violation. The new incremental parser gives better results than all previously published incremental CCG parsers, and outperforms even some widely used non-incremental CCG parsers.
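
The abstract leaves the loss implicit; the following is a minimal sketch of the "all violations" idea, assuming generic per-step beam scores rather than the paper's actual CCG transition model (the function name and margin value are illustrative only):

```python
# Illustrative sketch only: the paper's loss is defined over CCG transition
# sequences; here we just show the "all violations" idea on generic beam scores.
# A violation occurs at a decoding step when a beam hypothesis scores higher
# than (or within a margin of) the gold partial hypothesis.

def all_violations_loss(gold_scores, beam_scores, margin=1.0):
    """Sum of hinge losses over *all* beam-search margin violations,
    rather than only the single largest one (as in standard BSO).

    gold_scores : list[float]        score of the gold partial hypothesis at each step
    beam_scores : list[list[float]]  scores of the beam hypotheses at each step
    """
    loss = 0.0
    for gold, beam in zip(gold_scores, beam_scores):
        for hyp in beam:
            loss += max(0.0, margin - (gold - hyp))   # hinge on every violation
    return loss

# Toy usage: three decoding steps with a beam of size 2.
print(all_violations_loss([2.0, 1.5, 3.0], [[1.0, 2.5], [0.5, 0.2], [2.9, 3.2]]))
```

In the parser itself the scores would come from a neural model and the gradient of this summed hinge loss would be backpropagated through them.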

Span-Based LCFRS-2 Parsing
Miloš Stanojević | Mark Steedman
Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies

The earliest models for discontinuous constituency parsing used mildly context-sensitive grammars, but the fashion has changed in recent years to grammar-less transition-based parsers that use strong neural probabilistic models to greedily predict transitions. We argue that grammar-based approaches still have something to contribute on top of what is offered by transition-based parsers. Concretely, by using a grammar formalism to restrict the space of possible trees, we can use dynamic programming parsing algorithms for exact search for the most probable tree. Previous chart-based parsers for discontinuous formalisms used probabilistically weak generative models. We instead use a span-based discriminative neural model that preserves the dynamic programming properties of chart parsers. Our parser does not use an explicit grammar, but it does use explicit grammar-formalism constraints: we generate only trees that are within the LCFRS-2 formalism. These properties allow us to construct a new parsing algorithm with a lower worst-case time complexity of O(l n^4 + n^6), where n is the sentence length and l is the number of unique non-terminal labels. This parser is efficient in practice, provides the best results among chart-based parsers, and is competitive with the best transition-based parsers. We also show that the main bottleneck for further improvement is the restriction of fan-out to degree 2. We show that well-nestedness is helpful in speeding up parsing, but lowers accuracy.
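
As a rough illustration of what "span-based" means for a fan-out-2 formalism (this is not the paper's scoring model; the fake encoder, weights and function name below are assumptions made for the sketch), a discontinuous constituent can be represented by two disjoint spans and scored from its four boundary positions:

```python
import numpy as np

# Illustrative sketch: a fan-out-2 (discontinuous) constituent covers two
# disjoint spans (i, j) and (k, l).  A span-based discriminative parser scores
# such an item from representations of its boundary positions; here we fake
# the sentence encoder with random word vectors.

rng = np.random.default_rng(0)
n, d = 8, 16                       # sentence length, encoding size
enc = rng.normal(size=(n + 1, d))  # hypothetical fencepost encodings 0..n
w = rng.normal(size=4 * d)         # hypothetical scoring weights

def score_fanout2(i, j, k, l):
    """Score a discontinuous constituent spanning words [i, j) and [k, l)."""
    assert i < j <= k < l, "the two component spans must be disjoint and ordered"
    feats = np.concatenate([enc[i], enc[j], enc[k], enc[l]])
    return float(w @ feats)

print(score_fanout2(0, 2, 4, 6))   # e.g. a constituent covering words 0-1 and 4-5
```

The chart parser then combines such scored items with dynamic programming, subject to the LCFRS-2 constraints described above.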

2019

The Active-Filler Strategy in a Move-Eager Left-Corner Minimalist Grammar Parser
Tim Hunter | Miloš Stanojević | Edward Stabler
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Recent psycholinguistic evidence suggests that human parsing of moved elements is ‘active’, and perhaps even ‘hyper-active’: it seems that a leftward-moved object is related to a verbal position rapidly, perhaps even before the transitivity information associated with the verb is available to the listener. This paper presents a formal, sound and complete parser for Minimalist Grammars whose search space contains branching points that we can identify as the locus of the decision to perform this kind of active gap-finding. This brings formal models of parsing into closer contact with recent psycholinguistic theorizing than was previously possible.

CCG Parsing Algorithm with Incremental Tree Rotation
Miloš Stanojević | Mark Steedman
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

The main obstacle to incremental sentence processing arises from right-branching constituent structures, which are present in the majority of English sentences, as well as optional constituents that adjoin on the right, such as right adjuncts and right conjuncts. In CCG, many right-branching derivations can be replaced by semantically equivalent left-branching incremental derivations. The problem of right adjunction is more resistant to solution, and has been tackled in the past using revealing-based approaches that rely either on higher-order unification over lambda terms (Pareschi and Steedman, 1987) or on heuristics over dependency representations that do not cover the whole CCGbank (Ambati et al., 2015). We propose a new incremental parsing algorithm for CCG in the same revealing tradition, but with a purely syntactic approach that does not depend on access to a distinct level of semantic representation. This algorithm can cover the whole CCGbank, with greater incrementality and accuracy than previous proposals.
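
To make the left-branching idea concrete, here is a toy CCG fragment (independent of the paper's algorithm; all class and function names are invented for the illustration) showing how type-raising and forward composition derive "John likes Mary" left to right instead of via the usual right-branching application:

```python
# A minimal CCG sketch (not the paper's parser) showing how type-raising and
# forward composition turn a right-branching derivation into a left-branching,
# incremental one for "John likes Mary".

class Cat:
    def __init__(self, res=None, slash=None, arg=None, atom=None):
        self.res, self.slash, self.arg, self.atom = res, slash, arg, atom
    def __repr__(self):
        return self.atom if self.atom else f"({self.res}{self.slash}{self.arg})"
    def __eq__(self, other):
        return repr(self) == repr(other)

def atom(a):        return Cat(atom=a)
def fwd(res, arg):  return Cat(res=res, slash="/", arg=arg)    # res/arg
def bwd(res, arg):  return Cat(res=res, slash="\\", arg=arg)   # res\arg

def forward_apply(x, y):        # X/Y  Y   =>  X
    return x.res if x.slash == "/" and x.arg == y else None

def forward_compose(x, y):      # X/Y  Y/Z =>  X/Z
    return fwd(x.res, y.arg) if x.slash == "/" and y.slash == "/" and x.arg == y.res else None

def type_raise(x, t):           # X  =>  T/(T\X)
    return fwd(t, bwd(t, x))

NP, S = atom("NP"), atom("S")
john, likes, mary = NP, fwd(bwd(S, NP), NP), NP   # likes := (S\NP)/NP

# Left-branching, incremental derivation: (John^T >B likes) > Mary
john_tr = type_raise(john, S)                 # S/(S\NP)
partial = forward_compose(john_tr, likes)     # S/NP  (a partial analysis after two words)
print(forward_apply(partial, mary))           # S
```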

2018

A Sound and Complete Left-Corner Parsing for Minimalist Grammars
Miloš Stanojević | Edward Stabler
Proceedings of the Eighth Workshop on Cognitive Aspects of Computational Language Learning and Processing

This paper presents a left-corner parser for minimalist grammars. The relation between the parser and the grammar is transparent in the sense that there is a very simple one-to-one correspondence between derivations and parses. Like left-corner context-free parsers, left-corner minimalist parsers can fail to terminate when the grammar has empty left corners, so an easily computed left-corner oracle is defined to restrict the search.
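
For the context-free case the analogous oracle is just the reflexive-transitive left-corner relation; the sketch below (a hypothetical helper, not the paper's Minimalist Grammar construction) shows how easily such a table is computed:

```python
# Illustrative sketch for the context-free case: LC*(A) collects every symbol
# that can begin a derivation from A, and is used to prune parser predictions.

def left_corner_closure(rules):
    """rules: dict mapping a nonterminal to a list of right-hand sides (tuples)."""
    lc = {a: {a} for a in rules}                 # reflexive base case
    changed = True
    while changed:                               # iterate to a fixed point
        changed = False
        for a, rhss in rules.items():
            for rhs in rhss:
                first = rhs[0]
                new = lc.get(first, {first})     # terminals are their own closure
                if not new <= lc[a]:
                    lc[a] |= new
                    changed = True
    return lc

# Toy grammar: S -> NP VP, NP -> Det N | 'we', VP -> V NP
grammar = {
    "S":  [("NP", "VP")],
    "NP": [("Det", "N"), ("we",)],
    "VP": [("V", "NP")],
}
print(left_corner_closure(grammar)["S"])   # -> {'S', 'NP', 'Det', 'we'} (set order may vary)
```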

2017

Alternative Objective Functions for Training MT Evaluation Metrics
Miloš Stanojević | Khalil Sima’an
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

MT evaluation metrics are tested for correlation with human judgments either at the sentence level or at the corpus level. Trained metrics ignore corpus-level judgments and are trained only for high sentence-level correlation. We show that training for only one objective (sentence or corpus level) can not only harm performance on the other objective, but can also be suboptimal for the objective being optimized. To this end we present a metric trained at the corpus level and compare it empirically against a metric trained at the sentence level, showing how their performance may vary per language pair and per type and level of judgment. We then propose a model trained to optimize both objectives simultaneously and show that it is far more stable than, and on average outperforms, both single-objective models on both objectives.
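
A minimal sketch of the combined objective, assuming a linear metric over feature vectors, a pairwise hinge loss at the sentence level and negative Pearson correlation at the corpus level (the concrete losses, names and mixing weight alpha are assumptions, not the paper's exact formulation):

```python
import numpy as np

# Hedged sketch: a linear metric w . f(hyp, ref) trained to do well on both
# levels by mixing a sentence-level ranking loss with a corpus-level
# correlation loss.

def sentence_loss(w, better_feats, worse_feats, margin=1.0):
    """Hinge loss on human preference pairs: the preferred translation
    should outscore the dispreferred one by a margin."""
    diffs = (better_feats - worse_feats) @ w
    return np.maximum(0.0, margin - diffs).mean()

def corpus_loss(w, system_feats, human_scores):
    """Negative Pearson correlation between metric scores aggregated per
    system and the corpus-level human scores."""
    metric = system_feats @ w                  # one aggregated feature vector per system
    m, h = metric - metric.mean(), human_scores - human_scores.mean()
    return -(m @ h) / (np.linalg.norm(m) * np.linalg.norm(h) + 1e-9)

def combined_loss(w, pairs, systems, alpha=0.5):
    better, worse = pairs
    feats, human = systems
    return alpha * sentence_loss(w, better, worse) + (1 - alpha) * corpus_loss(w, feats, human)

# Toy data: 50 preference pairs, 5 systems, 10 features.
rng = np.random.default_rng(1)
w = rng.normal(size=10)
pairs = (rng.normal(size=(50, 10)), rng.normal(size=(50, 10)))
systems = (rng.normal(size=(5, 10)), rng.normal(size=5))
print(combined_loss(w, pairs, systems))
```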

Neural Discontinuous Constituency Parsing
Miloš Stanojević | Raquel G. Alhama
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

One of the most pressing issues in transition-based discontinuous constituency parsing is that the information relevant to a parsing decision can be located in any part of the stack or the buffer. In this paper, we address this problem by replacing the structured perceptron model with a recursive neural model that computes a global representation of the configuration, thereby allowing even the most remote parts of the configuration to influence parsing decisions. We also provide a detailed analysis of how this representation should be built out of sub-representations of its core elements (words, trees and stack). Additionally, we investigate how different types of swap oracles influence the results. Our model is the first neural discontinuous constituency parser, and it outperforms all previously published models on three out of four datasets, while on the fourth it comes second by a small margin.
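
The sketch below illustrates the general idea of a global configuration representation, with recursive composition of trees and a combination of stack and buffer representations; the architecture, dimensions and names are placeholders rather than the paper's model:

```python
import numpy as np

# Sketch of the general idea: build vector representations of trees
# recursively, then combine stack and buffer representations into one global
# configuration vector that scores the possible transitions.

rng = np.random.default_rng(0)
d, n_actions = 32, 4
W_comp = rng.normal(size=(d, 2 * d)) * 0.1   # hypothetical tree-composition weights
W_out  = rng.normal(size=(n_actions, 3 * d)) * 0.1

def embed_tree(tree, word_vecs):
    """Recursively compose a binary tree (leaf = word index, node = (left, right))."""
    if isinstance(tree, int):
        return word_vecs[tree]
    left, right = tree
    return np.tanh(W_comp @ np.concatenate([embed_tree(left, word_vecs),
                                            embed_tree(right, word_vecs)]))

def score_transitions(stack, buffer, word_vecs):
    """Global configuration = top-of-stack tree + rest of stack (averaged) + buffer (averaged)."""
    top  = embed_tree(stack[-1], word_vecs) if stack else np.zeros(d)
    rest = np.mean([embed_tree(t, word_vecs) for t in stack[:-1]], axis=0) if len(stack) > 1 else np.zeros(d)
    buf  = np.mean([word_vecs[i] for i in buffer], axis=0) if buffer else np.zeros(d)
    logits = W_out @ np.concatenate([top, rest, buf])
    return np.exp(logits) / np.exp(logits).sum()   # probability of each transition

words = rng.normal(size=(6, d))                    # toy word embeddings
print(score_transitions([(0, 1), 2], [3, 4, 5], words))
```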

2016

Hierarchical Permutation Complexity for Word Order Evaluation
Miloš Stanojević | Khalil Sima’an
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Existing approaches for evaluating word order in machine translation work with metrics computed directly over a permutation of word positions in system output relative to a reference translation. However, every permutation factorizes into a permutation tree (PET) built of primal permutations, i.e., atomic units that do not factorize any further. In this paper we explore the idea that permutations factorizing into (on average) shorter primal permutations should represent simpler ordering as well. Consequently, we contribute Permutation Complexity, a class of metrics over PETs and their extension to forests, and define tight metrics, a sub-class of metrics implementing this idea. Subsequently we define example tight metrics and empirically test them in word order evaluation. Experiments on the WMT13 data sets for ten language pairs show that a tight metric is more often than not better than the baselines.
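
A small sketch of the underlying notion of primality (a generic check, not the paper's metric): a permutation is primal if no proper contiguous segment of it covers a contiguous range of values.

```python
# A block of a permutation is a contiguous segment whose values also form a
# contiguous range; a permutation is *primal* (prime/simple) if it has no
# proper block longer than 1, i.e. it does not factorize any further.

def is_block(perm, i, j):
    """True if perm[i:j] covers a contiguous range of values."""
    seg = perm[i:j]
    return max(seg) - min(seg) == j - i - 1

def is_primal(perm):
    n = len(perm)
    return not any(is_block(perm, i, j)
                   for i in range(n) for j in range(i + 2, n + 1)
                   if j - i < n)

print(is_primal([2, 4, 1, 3]))   # True: the classic non-factorizable reordering
print(is_primal([2, 1, 4, 3]))   # False: factorizes, e.g. into the blocks (2,1) and (4,3)
```

Metrics over PETs then reward permutations whose factorization is built from short primal operators, following the intuition stated in the abstract.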

Universal Reordering via Linguistic Typology
Joachim Daiber | Miloš Stanojević | Khalil Sima’an
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In this paper we explore the novel idea of building a single universal reordering model from English to a large number of target languages. To build this model, we exploit typological word-order features for a large number of target languages together with source (English) syntactic features, and we train the model on a single combined parallel corpus representing all 22 involved language pairs. We contribute experimental evidence for the usefulness of linguistically defined typological features for building such a model. When the universal reordering model is used for preordering followed by monotone translation (no reordering inside the decoder), our experiments show that this pipeline gives comparable or improved translation performance relative to a phrase-based baseline for a large number of language pairs (12 out of 22) from diverse language families.

Examining the Relationship between Preordering and Word Order Freedom in Machine Translation
Joachim Daiber | Miloš Stanojević | Wilker Aziz | Khalil Sima’an
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers

Results of the WMT16 Metrics Shared Task
Ondřej Bojar | Yvette Graham | Amir Kamran | Miloš Stanojević
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

Results of the WMT16 Tuning Shared Task
Bushra Jawaid | Amir Kamran | Miloš Stanojević | Ondřej Bojar
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2015

Reordering Grammar Induction
Miloš Stanojević | Khalil Sima’an
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Results of the WMT15 Metrics Shared Task
Miloš Stanojević | Amir Kamran | Philipp Koehn | Ondřej Bojar
Proceedings of the Tenth Workshop on Statistical Machine Translation

Results of the WMT15 Tuning Shared Task
Miloš Stanojević | Amir Kamran | Ondřej Bojar
Proceedings of the Tenth Workshop on Statistical Machine Translation

BEER 1.1: ILLC UvA submission to metrics and tuning task
Miloš Stanojević | Khalil Sima’an
Proceedings of the Tenth Workshop on Statistical Machine Translation

2014

BEER: BEtter Evaluation as Ranking
Miloš Stanojević | Khalil Sima’an
Proceedings of the Ninth Workshop on Statistical Machine Translation

Evaluating Word Order Recursively over Permutation-Forests
Miloš Stanojević | Khalil Sima’an
Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation

Fitting Sentence Level Translation Evaluation with Many Dense Features
Miloš Stanojević | Khalil Sima’an
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2012

Selecting Data for English-to-Czech Machine Translation
Aleš Tamchyna | Petra Galuščáková | Amir Kamran | Miloš Stanojević | Ondřej Bojar
Proceedings of the Seventh Workshop on Statistical Machine Translation