Marek Maziarz


2016

plWordNet 3.0 – a Comprehensive Lexical-Semantic Resource
Marek Maziarz | Maciej Piasecki | Ewa Rudnicka | Stan Szpakowicz | Paweł Kędzia
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We have released plWordNet 3.0, a very large wordnet for Polish. In addition to what is expected in wordnets – richly interrelated synsets – it contains sentiment and emotion annotations, a large set of multi-word expressions, and a mapping onto WordNet 3.1. Part of the release is enWordNet 1.0, a substantially enlarged copy of WordNet 3.1, with material added to allow for a more complete mapping. The paper discusses the design principles of plWordNet, its content, its statistical portrait, a comparison with similar resources, and a partial list of applications.

2015

A Procedural Definition of Multi-word Lexical Units
Marek Maziarz | Stan Szpakowicz | Maciej Piasecki
Proceedings of the International Conference Recent Advances in Natural Language Processing

Extraction of the Multi-word Lexical Units in the Perspective of the Wordnet Expansion
Maciej Piasecki | Michał Wendelberger | Marek Maziarz
Proceedings of the International Conference Recent Advances in Natural Language Processing

2014

plWordNet as the Cornerstone of a Toolkit of Lexico-semantic Resources
Marek Maziarz | Maciej Piasecki | Ewa Rudnicka | Stan Szpakowicz
Proceedings of the Seventh Global Wordnet Conference

Registers in the System of Semantic Relations in plWordNet
Marek Maziarz | Maciej Piasecki | Ewa Rudnicka | Stan Szpakowicz
Proceedings of the Seventh Global Wordnet Conference

2013

Recognizing semantic relations within Polish noun phrase: A rule-based approach
Paweł Kędzia | Marek Maziarz
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

Beyond the Transfer-and-Merge Wordnet Construction: plWordNet and a Comparison with WordNet
Marek Maziarz | Maciej Piasecki | Ewa Rudnicka | Stan Szpakowicz
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

2012

Tools for plWordNet Development. Presentation and Perspectives
Bartosz Broda | Marek Maziarz | Maciej Piasecki
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Building a wordnet is a serious undertaking. Fortunately, Language Technology (LT) can improve the process of wordnet construction in terms of both quality and cost. In this paper we present the LT tools used during the construction of plWordNet and their influence on the lexicographer's workflow. LT is employed in plWordNet development at every possible step, from data gathering through data analysis to data presentation. Every decision still requires input from the lexicographer, but the quality of the supporting tools is an important factor. We therefore carry out a limited evaluation of the usefulness of the employed tools on the basis of questionnaires.

Recognition of Polish Derivational Relations Based on Supervised Learning Scheme
Maciej Piasecki | Radoslaw Ramocki | Marek Maziarz
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The paper presents the construction of Derywator, a language tool for the recognition of Polish derivational relations. It was built using machine learning, following a bootstrapping approach: a limited set of derivational pairs described manually by linguists in plWordNet is used to train Derywator. The tool is intended for semi-automated expansion of plWordNet with new instances of derivational relations. The training process is based on the construction of two transducers working in opposite directions: one for prefixes and one for suffixes. Internal stem alternations are recognised, recorded in the form of mapping sequences, and stored together with the transducers. Raw results produced by Derywator then undergo corpus-based and morphological filtering. A set of derivational relations defined in plWordNet is presented. Results of tests for different derivational relations are discussed. The problem of the necessary corpus-based semantic filtering is analysed. The presented tool depends only to a small extent on hand-crafted knowledge of a particular language: only a table of possible alternations and the morphological filtering rules must be replaced, which should take no longer than a couple of working days.
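
As a rough illustration of the idea of learning derivational patterns from example pairs, the following Python sketch counts suffix rewrite rules in a handful of training pairs and applies the frequent ones to a new word. It is a deliberately simplified stand-in, not the transducer-based method of the paper: it handles suffixes only, ignores prefixes and stem alternations, and performs no filtering. The function names, the `min_count` threshold, and the toy training pairs are all assumptions made for this example.

```python
from collections import Counter
from os.path import commonprefix  # character-wise longest common prefix

# Toy training data (assumed for illustration): (base word, derived word).
TRAINING_PAIRS = [
    ("dom", "domek"),
    ("kot", "kotek"),
    ("nos", "nosek"),
    ("ryba", "rybka"),
]


def learn_suffix_rules(pairs):
    """Count suffix rewrite rules (old_suffix, new_suffix) seen in the pairs."""
    rules = Counter()
    for base, derived in pairs:
        stem = commonprefix([base, derived])
        # Whatever follows the shared stem is treated as the rewritten suffix.
        rules[(base[len(stem):], derived[len(stem):])] += 1
    return rules


def propose_derivatives(word, rules, min_count=2):
    """Apply every rule seen at least min_count times whose old suffix matches."""
    candidates = set()
    for (old, new), count in rules.items():
        if count >= min_count and word.endswith(old):
            # Slice by explicit length so that an empty old suffix works too.
            candidates.add(word[:len(word) - len(old)] + new)
    return candidates


rules = learn_suffix_rules(TRAINING_PAIRS)
frequent_only = propose_derivatives("płot", rules)
all_rules = propose_derivatives("żaba", rules, min_count=1)
```

With the default threshold only the rule seen three times fires, so `frequent_only` contains just "płotek". Lowering `min_count` to 1 lets both rules fire on "żaba", producing "żabka" alongside the non-word "żabaek", which hints at why raw candidates need a later filtering stage such as the one the abstract mentions.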

KPWr: Towards a Free Corpus of Polish
Bartosz Broda | Michał Marcińczuk | Marek Maziarz | Adam Radziszewski | Adam Wardyński
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents our efforts aimed at collecting and annotating a free Polish corpus. The corpus will serve us as training and testing material for experiments with Machine Learning algorithms. As others may also benefit from the resource, we are going to release it under a Creative Commons licence, which we hope will remove unnecessary usage restrictions and also facilitate reproduction of our experimental results. The corpus is being annotated with various types of linguistic entities: chunks and named entities, selected syntactic and semantic relations, word senses and anaphora. We report on the current state of the project as well as our ultimate goals.

A Strategy of Mapping Polish WordNet onto Princeton WordNet
Ewa Rudnicka | Marek Maziarz | Maciej Piasecki | Stan Szpakowicz
Proceedings of COLING 2012: Posters

IKAR: An Improved Kit for Anaphora Resolution for Polish
Bartosz Broda | Łukasz Burdka | Marek Maziarz
Proceedings of COLING 2012: Demonstration Papers