Lexicalization of Probabilistic Linear Context-free Rewriting Systems

Richard Mörbitz, Thomas Ruprecht


Abstract
In the field of constituent parsing, probabilistic grammar formalisms have been studied to model the syntactic structure of natural language. More recently, approaches based on neural models have gained considerable traction in this field, as they achieve accurate results at high speed. We aim for a symbiosis between probabilistic linear context-free rewriting systems (PLCFRS) as a probabilistic grammar formalism and neural models to get the best of both worlds: the interpretability of grammars, and the speed and accuracy of neural models. To combine the two, we consider supertagging, an approach that requires lexicalized grammar formalisms. Here, we present a procedure that turns any PLCFRS G into an equivalent lexicalized PLCFRS G’. The derivation trees in G’ are then mapped to equivalent derivations in G. Our construction of G’ preserves the probability assignment and does not increase parsing complexity compared to G.
Anthology ID: 2020.iwpt-1.10
Volume: Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies
Month: July
Year: 2020
Address: Online
Venues: ACL | IWPT | WS
SIG: SIGPARSE
Publisher: Association for Computational Linguistics
Pages: 98–104
URL: https://www.aclweb.org/anthology/2020.iwpt-1.10
DOI: 10.18653/v1/2020.iwpt-1.10
PDF: http://aclanthology.lst.uni-saarland.de/2020.iwpt-1.10.pdf
Video: http://slideslive.com/38929677