Kilian Evang


2020

Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories
Kilian Evang | Laura Kallmeyer | Rafael Ehren | Simon Petitjean | Esther Seyffarth | Djamé Seddah
Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories

Configurable Dependency Tree Extraction from CCG Derivations
Kilian Evang
Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020)

We revisit the problem of extracting dependency structures from the derivation structures of Combinatory Categorial Grammar (CCG). Previous approaches are often restricted to a narrow subset of CCG or support only one flavor of dependency tree. Our approach is more general and easily configurable, so that multiple styles of dependency tree can be obtained. In an initial case study, we show promising results for converting English, German, Italian, and Dutch CCG derivations from the Parallel Meaning Bank into (unlabeled) UD-style dependency trees.

2019

Transition-based DRS Parsing Using Stack-LSTMs
Kilian Evang
Proceedings of the IWCS Shared Task on Semantic Parsing

We present our submission to the IWCS 2019 shared task on semantic parsing, a transition-based parser that uses explicit word-meaning pairings, but no explicit representation of syntax. Parsing decisions are made based on vector representations of parser states, encoded via stack-LSTMs (Ballesteros et al., 2017), as well as some heuristic rules. Our system reaches 70.88% f-score in the competition.

CCGweb: a New Annotation Tool and a First Quadrilingual CCG Treebank
Kilian Evang | Lasha Abzianidze | Johan Bos
Proceedings of the 13th Linguistic Annotation Workshop

We present the first open-source graphical annotation tool for combinatory categorial grammar (CCG), and the first set of detailed guidelines for syntactic annotation with CCG, for four languages: English, German, Italian, and Dutch. We also release a parallel pilot CCG treebank based on these guidelines, with 4x100 adjudicated sentences, 10K single-annotator fully corrected sentences, and 82K single-annotator partially corrected sentences.

Proceedings of the 18th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2019)
Marie Candito | Kilian Evang | Stephan Oepen | Djamé Seddah
Proceedings of the 18th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2019)

Cross-lingual CCG Induction
Kilian Evang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Combinatory categorial grammars are linguistically motivated and useful for semantic parsing, but costly to acquire in a supervised way and difficult to acquire in an unsupervised way. We propose an alternative making use of cross-lingual learning: an existing source-language parser is used together with a parallel corpus to induce a grammar and parsing model for a target language. On the PASCAL benchmark, cross-lingual CCG induction outperforms CCG induction from gold-standard POS tags on 3 out of 8 languages, and unsupervised CCG induction on 6 out of 8 languages. We also show that cross-lingually induced CCGs reflect syntactic properties of the target languages.

Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)
Rada Mihalcea | Ekaterina Shutova | Lun-Wei Ku | Kilian Evang | Soujanya Poria
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)

2017

The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations
Lasha Abzianidze | Johannes Bjerva | Kilian Evang | Hessel Haagsma | Rik van Noord | Pierre Ludmann | Duc-Duy Nguyen | Johan Bos
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations comprising over 11 million words divided over four languages (English, German, Italian, and Dutch). Our approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, assuming that the translations are meaning-preserving. The semantic annotation consists of five main steps: (i) segmentation of the text in sentences and lexical items; (ii) syntactic parsing with Combinatory Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and (v) compositional semantic analysis based on Discourse Representation Theory. These steps are performed using statistical models trained in a semi-supervised manner. The employed annotation models are all language-neutral. Our first results are promising.

Last Words: Sharing Is Caring: The Future of Shared Tasks
Malvina Nissim | Lasha Abzianidze | Kilian Evang | Rob van der Goot | Hessel Haagsma | Barbara Plank | Martijn Wieling
Computational Linguistics, Volume 43, Issue 4 - December 2017

BuzzSaw at SemEval-2017 Task 7: Global vs. Local Context for Interpreting and Locating Homographic English Puns with Sense Embeddings
Dieke Oele | Kilian Evang
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our system participating in the SemEval-2017 Task 7, for the subtasks of homographic pun location and homographic pun interpretation. For pun interpretation, we use a knowledge-based Word Sense Disambiguation (WSD) method based on sense embeddings. Pun-based jokes can be divided into two parts, each containing information about the two distinct senses of the pun. To exploit this structure we split the context that is input to the WSD system into two local contexts and find the best sense for each of them. We use the output of pun interpretation for pun location. As we expect the two meanings of a pun to be very dissimilar, we compute sense embedding cosine distances for each sense-pair and select the word that has the highest distance. We describe experiments on different methods of splitting the context and compare our method to several baselines. We find evidence supporting our hypotheses and obtain competitive results for pun interpretation.

2016

Cross-lingual Learning of an Open-domain Semantic Parser
Kilian Evang | Johan Bos
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We propose a method for learning semantic CCG parsers by projecting annotations via a parallel corpus. The method opens an avenue towards cheaply creating multilingual semantic parsers mapping open-domain text to formal meaning representations. A first cross-lingually learned Dutch (from English) semantic parser obtains f-scores ranging from 42.99% to 69.22% depending on the level of label informativity taken into account, compared to 58.40% to 78.88% for the underlying source-language system. These are promising numbers compared to state-of-the-art semantic parsing in open domains.

2014

RoBox: CCG with Structured Perceptron for Supervised Semantic Parsing of Robotic Spatial Commands
Kilian Evang | Johan Bos
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2013

Elephant: Sequence Labeling for Word and Sentence Segmentation
Kilian Evang | Valerio Basile | Grzegorz Chrupała | Johan Bos
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Scope Disambiguation as a Tagging Task
Kilian Evang | Johan Bos
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Short Papers

Gamification for Word Sense Labeling
Noortje J. Venhuizen | Valerio Basile | Kilian Evang | Johan Bos
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Short Papers

2012

UGroningen: Negation detection with Discourse Representation Structures
Valerio Basile | Johan Bos | Kilian Evang | Noortje Venhuizen
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

Developing a large semantically annotated corpus
Valerio Basile | Johan Bos | Kilian Evang | Noortje Venhuizen
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

What would be a good method to provide a large collection of semantically annotated texts with formal, deep semantics rather than shallow? We argue that a bootstrapping approach comprising state-of-the-art NLP tools for parsing and semantic interpretation, in combination with a wiki-like interface for collaborative annotation of experts, and a game with a purpose for crowdsourcing, are the starting ingredients for fulfilling this enterprise. The result is a semantic resource that anyone can edit and that integrates various phenomena, including predicate-argument structure, scope, tense, thematic roles, rhetorical relations and presuppositions, into a single semantic formalism: Discourse Representation Theory. Taking texts rather than sentences as the units of annotation results in deep semantic representations that incorporate discourse structure and dependencies. To manage the various (possibly conflicting) annotations provided by experts and non-experts, we introduce a method that stores "Bits of Wisdom" in a database as stand-off annotations.

A platform for collaborative semantic annotation
Valerio Basile | Johan Bos | Kilian Evang | Noortje Venhuizen
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

PLCFRS Parsing of English Discontinuous Constituents
Kilian Evang | Laura Kallmeyer
Proceedings of the 12th International Conference on Parsing Technologies

2008

The Metadata-Database of a Next Generation Sustainability Web-Platform for Language Resources
Georg Rehm | Oliver Schonefeld | Andreas Witt | Timm Lehmberg | Christian Chiarcos | Hanan Bechara | Florian Eishold | Kilian Evang | Magdalena Leshtanska | Aleksandar Savkov | Matthias Stark
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Our goal is to provide a web-based platform for the long-term preservation and distribution of a heterogeneous collection of linguistic resources. We discuss the corpus preprocessing and normalisation phase that results in sets of multi-rooted trees. At the same time we transform the original metadata records, just like the corpora annotated using different annotation approaches and exhibiting different levels of granularity, into the all-encompassing and highly flexible format eTEI for which we present editing and parsing tools. We also discuss the architecture of the sustainability platform. Its primary components are an XML database that contains corpus and metadata files and an SQL database that contains user accounts and access control lists. A staging area, whose structure, contents, and consistency can be checked using tools, is used to make sure that new resources about to be imported into the platform have the correct structure.

TuLiPA: Towards a Multi-Formalism Parsing Environment for Grammar Engineering
Laura Kallmeyer | Timm Lichte | Wolfgang Maier | Yannick Parmentier | Johannes Dellert | Kilian Evang
Coling 2008: Proceedings of the workshop on Grammar Engineering Across Frameworks