Katalin Pajkossy


The hunvec framework for NN-CRF-based sequential tagging
Katalin Pajkossy | Attila Zséder
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this work we present the open source hunvec framework for sequential tagging, built upon Theano and Pylearn2. The underlying statistical model, which connects linear CRFs with neural networks, was used by Collobert and co-workers, and by several other researchers. To demonstrate the flexibility of our tool, we describe a set of experiments on part-of-speech tagging and named entity recognition tasks, using English and Hungarian datasets, in which we modify both model and training parameters and illustrate the usage of custom features. The model parameters we experiment with affect the vectorial word representations used by the model: we apply different word vector initializations, defined by Word2vec and GloVe embeddings, and enrich the representation of words with vectors assigned to trigram features. We extend the training methods with their regularized (L2 and dropout) versions. When testing our framework on a Hungarian named entity corpus, we find that its performance reaches the best published results on this dataset, with no need for language-specific feature engineering. Our code is available at http://github.com/zseder/hunvec

Measuring Semantic Similarity of Words Using Concept Networks
Gábor Recski | Eszter Iklódi | Katalin Pajkossy | András Kornai
Proceedings of the 1st Workshop on Representation Learning for NLP


Competence in lexical semantics
András Kornai | Judit Ács | Márton Makrai | Dávid Márk Nemeskey | Katalin Pajkossy | Gábor Recski
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics


Building basic vocabulary across 40 languages
Judit Ács | Katalin Pajkossy | András Kornai
Proceedings of the Sixth Workshop on Building and Using Comparable Corpora