Eva Fucikova

Also published as: Eva Fučíková


2020

pdf bib
SynSemClass Linked Lexicon: Mapping Synonymy between Languages
Zdenka Uresova | Eva Fucikova | Eva Hajicova | Jan Hajic
Proceedings of the 2020 Globalex Workshop on Linked Lexicography

This paper reports on an extended version of a synonym verb class lexicon, newly called SynSemClass (formerly CzEngClass). This lexicon stores cross-lingual semantically similar verb senses in synonym classes extracted from a richly annotated parallel corpus, the Prague Czech-English Dependency Treebank. When building the lexicon, we make use of predicate-argument relations (valency) and link them to semantic roles; in addition, each entry is linked to several external lexicons of more or less “semantic” nature, namely FrameNet, WordNet, VerbNet, OntoNotes and PropBank, and Czech VALLEX. The aim is to provide a linguistic resource that can be used to compare semantic roles and their syntactic properties and features across languages within and across synonym groups (classes, or ’synsets’), as well as gold standard data for automatic NLP experiments with such synonyms, such as synonym discovery, feature mapping, etc. However, perhaps the most important goal is to eventually build an event type ontology that can be referenced and used as a human-readable and human-understandable “database” for all types of events, processes and states. While the current paper describes primarily the content of the lexicon, we are also presenting a preliminary design of a format compatible with Linked Data, on which we are hoping to get feedback during discussions at the workshop. Once the resource (in whichever form) is applied to corpus annotation, deep analysis will be possible using such combined resources as training data.

2019

pdf bib
Parallel Dependency Treebank Annotated with Interlinked Verbal Synonym Classes and Roles
Zdeňka Urešová | Eva Fučíková | Eva Hajičová | Jan Hajič
Proceedings of the 18th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2019)

2018

pdf bib
Tools for Building an Interlinked Synonym Lexicon Network
Zdeňka Urešová | Eva Fučíková | Eva Hajičová | Jan Hajič
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Creating a Verb Synonym Lexicon Based on a Parallel Corpus
Zdeňka Urešová | Eva Fučíková | Eva Hajičová | Jan Hajič
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Synonymy in Bilingual Context: The CzEngClass Lexicon
Zdeňka Urešová | Eva Fučíková | Eva Hajičová | Jan Hajič
Proceedings of the 27th International Conference on Computational Linguistics

This paper describes CzEngClass, a bilingual lexical resource being built to investigate verbal synonymy in bilingual context and to relate semantic roles common to one synonym class to verb arguments (verb valency). In addition, the resource is linked to existing resources with the same of a similar aim: English and Czech WordNet, FrameNet, PropBank, VerbNet (SemLink), and valency lexicons for Czech and English (PDT-Vallex, Vallex, and EngVallex). There are several goals of this work and resource: (a) to provide gold standard data for automatic experiments in the future (such as automatic discovery of synonym classes, word sense disambiguation, assignment of classes to occurrences of verbs in text, coreferential linking of verb and event arguments in text, etc.), (b) to build a core (bilingual) lexicon linked to existing resources, for comparative studies and possibly for training automatic tools, and (c) to enrich the annotation of a parallel treebank, the Prague Czech English Dependency Treebank, which so far contained valency annotation but has not linked synonymous senses of verbs together. The method used for extracting the synonym classes is a semi-automatic process with a substantial amount of manual work during filtering, role assignment to classes and individual Class members’ arguments, and linking to the external lexical resources. We present the first version with 200 classes (about 1800 verbs) and evaluate interannotator agreement using several metrics.

2016

pdf bib
Joint search in a bilingual valency lexicon and an annotated corpus
Eva Fučíková | Jan Hajič | Zdeňka Urešová
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

In this paper and the associated system demo, we present an advanced search system that allows to perform a joint search over a (bilingual) valency lexicon and a correspondingly annotated linked parallel corpus. This search tool has been developed on the basis of the Prague Czech-English Dependency Treebank, but its ideas are applicable in principle to any bilingual parallel corpus that is annotated for dependencies and valency (i.e., predicate-argument structure), and where verbs are linked to appropriate entries in an associated valency lexicon. Our online search tool consolidates more search interfaces into one, providing expanded structured search capability and a more efficient advanced way to search, allowing users to search for verb pairs, verbal argument pairs, their surface realization as recorded in the lexicon, or for their surface form actually appearing in the linked parallel corpus. The search system is currently under development, and is replacing our current search tool available at http://lindat.mff.cuni.cz/services/CzEngVallex, which could search the lexicon but the queries cannot take advantage of the underlying corpus nor use the additional surface form information from the lexicon(s). The system is available as open source.

pdf bib
Non-projectivity and valency
Zdenka Uresova | Eva Fucikova | Jan Hajic
Proceedings of the Workshop on Discontinuous Structures in Natural Language Processing

pdf bib
Enriching a Valency Lexicon by Deverbative Nouns
Eva Fučíková | Jan Hajič | Zdeňka Urešová
Proceedings of the Workshop on Grammar and Lexicon: interactions and interfaces (GramLex)

We present an attempt to automatically identify Czech deverbative nouns using several methods that use large corpora as well as existing lexical resources. The motivation for the task is to extend a verbal valency (i.e., predicate-argument) lexicon by adding nouns that share the valency properties with the base verb, assuming their properties can be derived (even if not trivially) from the underlying verb by deterministic grammatical rules. At the same time, even in inflective languages, not all deverbatives are simply created from their underlying base verb by regular lexical derivation processes. We have thus developed hybrid techniques that use both large parallel corpora and several standard lexical resources. Thanks to the use of parallel corpora, the resulting sets contain also synonyms, which the lexical derivation rules cannot get. For evaluation, we have manually created a small, 100-verb gold data since no such dataset was initially available for Czech.

2015

pdf bib
Bilingual English-Czech Valency Lexicon Linked to a Parallel Corpus
Zdeňka Urešová | Ondřej Dušek | Eva Fučíková | Jan Hajič | Jana Šindlerová
Proceedings of The 9th Linguistic Annotation Workshop

pdf bib
Using Parallel Texts and Lexicons for Verbal Word Sense Disambiguation
Ondřej Dušek | Eva Fučíková | Jan Hajič | Martin Popel | Jana Šindlerová | Zdeňka Urešová
Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015)

pdf bib
Zero Alignment of Verb Arguments in a Parallel Treebank
Jana Šindlerová | Eva Fučíková | Zdeňka Urešová
Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015)

2014

pdf bib
Resources in Conflict: A Bilingual Valency Lexicon vs. a Bilingual Treebank vs. a Linguistic Theory
Jana Šindlerová | Zdeňka Urešová | Eva Fucikova
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper, we would like to exemplify how a syntactically annotated bilingual treebank can help us in exploring and revising a developed linguistic theory. On the material of the Prague Czech-English Dependency Treebank we observe sentences in which an Addressee argument in one language is linked translationally to a Patient argument in the other one, and make generalizations about the theoretical grounds of the argument non-correspondences and its relations to the valency theory beyond the annotation practice. Exploring verbs of three semantic classes (Judgement verbs, Teaching verbs and Attempt Suasion verbs) we claim that the Functional Generative Description argument labelling is highly dependent on the morphosyntactic realization of the individual participants, which then results in valency frame differences. Nevertheless, most of the differences can be overcome without substantial changes to the linguistic theory itself.

2013

pdf bib
An Analysis of Annotation of Verb-Noun Idiomatic Combinations in a Parallel Dependency Corpus
Zdenka Uresova | Jan Hajic | Eva Fucikova | Jana Sindlerova
Proceedings of the 9th Workshop on Multiword Expressions

2012

pdf bib
Announcing Prague Czech-English Dependency Treebank 2.0
Jan Hajič | Eva Hajičová | Jarmila Panevová | Petr Sgall | Ondřej Bojar | Silvie Cinková | Eva Fučíková | Marie Mikulová | Petr Pajas | Jan Popelka | Jiří Semecký | Jana Šindlerová | Jan Štěpánek | Josef Toman | Zdeňka Urešová | Zdeněk Žabokrtský
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We introduce a substantial update of the Prague Czech-English Dependency Treebank, a parallel corpus manually annotated at the deep syntactic layer of linguistic representation. The English part consists of the Wall Street Journal (WSJ) section of the Penn Treebank. The Czech part was translated from the English source sentence by sentence. This paper gives a high level overview of the underlying linguistic theory (the so-called tectogrammatical annotation) with some details of the most important features like valency annotation, ellipsis reconstruction or coreference.