Advances in Using Grammars with Latent Annotations for Discontinuous Parsing
Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies
We present new experiments that transfer techniques from Probabilistic Context-free Grammars with Latent Annotations (PCFG-LA) to two grammar formalisms for discontinuous parsing: linear context-free rewriting systems and hybrid grammars. In particular, we evaluate Dirichlet priors during EM training, ensemble models, and a new nonterminal scheme for hybrid grammars. We find that our grammars are more accurate than previous approaches based on discontinuous grammar formalisms and than early discriminative models, but less accurate than recent discriminative parsers.
Latent Variable Grammars for Discontinuous Parsing
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing
Generic refinement of expressive grammar formalisms with an application to discontinuous constituent parsing
Proceedings of the 27th International Conference on Computational Linguistics
We formulate a generalization of the split/merge algorithm of Petrov et al. (2006) for interpreted regular tree grammars (Koller and Kuhlmann, 2011), which capture a large class of grammar formalisms. We evaluate its effectiveness empirically on the task of discontinuous constituent parsing with two mildly context-sensitive grammar formalisms: linear context-free rewriting systems (Vijay-Shanker et al., 1987) and hybrid grammars (Nederhof and Vogler, 2014).
Hybrid Grammars for Parsing of Discontinuous Phrase Structures and Non-Projective Dependency Structures
Computational Linguistics, Volume 43, Issue 3 - September 2017
We explore the concept of hybrid grammars, which formalize and generalize a range of existing frameworks for dealing with discontinuous syntactic structures. Covered are both discontinuous phrase structures and non-projective dependency structures. Technically, hybrid grammars are related to synchronous grammars, where one grammar component generates linear structures and another generates hierarchical structures. By coupling lexical elements of both components together, discontinuous structures result. Several types of hybrid grammars are characterized. We also discuss grammar induction from treebanks. The main advantage over existing frameworks is the ability of hybrid grammars to separate discontinuity of the desired structures from time complexity of parsing. This permits exploration of a large variety of parsing algorithms for discontinuous structures, with different properties. This is confirmed by the reported experimental results, which show a wide variety of running time, accuracy, and frequency of parse failures.
EM-Training for Weighted Aligned Hypergraph Bimorphisms
Proceedings of the SIGFSM Workshop on Statistical NLP and Weighted Automata
A Direct Link between Tree-Adjoining and Context-Free Tree Grammars
Proceedings of the 12th International Conference on Finite-State Methods and Natural Language Processing 2015 (FSMNLP 2015 Düsseldorf)