Gábor Berend


2020

ProsperAMnet at the FinSim Task: Detecting hypernyms of financial concepts via measuring the information stored in sparse word representations
Gábor Berend | Norbert Kis-Szabó | Zsolt Szántó
Proceedings of the Second Workshop on Financial Technology and Natural Language Processing

Quasi-Multitask Learning: an Efficient Surrogate for Obtaining Model Ensembles
Norbert Kis-Szabó | Gábor Berend
Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing

We propose the technique of quasi-multitask learning (Q-MTL), a simple and easy-to-implement modification of standard multitask learning, in which the tasks to be modeled are identical. With this easy modification of a standard neural classifier we can get benefits similar to an ensemble of classifiers with a fraction of the resources required. We illustrate through a series of sequence labeling experiments over a diverse set of languages that applying Q-MTL consistently increases the generalization ability of the applied models. The proposed architecture can be regarded as a new regularization technique that encourages the model to develop an internal representation of the problem at hand which is beneficial to multiple output units of the classifier at the same time. Our experiments corroborate that by relying on the proposed algorithm, we can approximate the quality of an ensemble of classifiers at a fraction of the computational resources required. Additionally, our results suggest that Q-MTL handles the presence of noisy training labels better than ensembles.
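
The idea of several output units sharing one internal representation can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the class name `QuasiMultitaskClassifier`, the single tanh layer, the summed per-head loss, and the averaging of head distributions are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class QuasiMultitaskClassifier:
    """Hypothetical sketch: one shared layer feeding k heads for the SAME task."""

    def __init__(self, n_inputs, n_hidden, n_classes, k, seed=0):
        rng = random.Random(seed)
        # Shared parameters: every head reads the same hidden representation.
        self.shared = random_matrix(n_hidden, n_inputs, rng)
        # k independently initialised output heads, all predicting the same labels.
        self.heads = [random_matrix(n_classes, n_hidden, rng) for _ in range(k)]

    def head_distributions(self, x):
        hidden = [math.tanh(a) for a in matvec(self.shared, x)]
        return [softmax(matvec(head, hidden)) for head in self.heads]

    def predict_proba(self, x):
        # Assumption: combine heads by averaging their distributions.
        dists = self.head_distributions(x)
        k = len(dists)
        return [sum(d[c] for d in dists) / k for c in range(len(dists[0]))]

    def loss(self, x, gold):
        # Assumption: training objective is the sum of per-head cross-entropies.
        return sum(-math.log(d[gold]) for d in self.head_distributions(x))


model = QuasiMultitaskClassifier(n_inputs=4, n_hidden=8, n_classes=3, k=5)
x = [0.2, -1.0, 0.5, 0.3]
print(model.predict_proba(x))
print(model.loss(x, gold=1))
```

In this sketch only the output heads are replicated, so the extra cost over a single classifier is k small output layers rather than k full models, which is the resource argument the abstract makes against conventional ensembles.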

ProsperAMnet at FinCausal 2020, Task 1 & 2: Modeling causality in financial texts using multi-headed transformers
Zsolt Szántó | Gábor Berend
Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation

This paper introduces our efforts at the FinCausal shared task for modeling causality in financial utterances. Our approach uses the commonly and successfully applied strategy of fine-tuning a transformer-based language model with a twist, i.e., we modified the training and inference mechanism such that our model produces multiple predictions for the same instance. By designing a model that returns k>1 predictions at the same time, we not only obtain more resource-efficient training (as opposed to fine-tuning some pre-trained language model k independent times), but our results indicate that we are also capable of obtaining comparable or even better evaluation scores that way. We compare multiple strategies for combining the k predictions of our model. Our submissions ranked third on both subtasks of the shared task.

Sparsity Makes Sense: Word Sense Disambiguation Using Sparse Contextualized Word Representations
Gábor Berend
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

In this paper, we demonstrate that by utilizing sparse word representations, it becomes possible to surpass the results of more complex task-specific models on the task of fine-grained all-words word sense disambiguation. Our proposed algorithm relies on an overcomplete set of semantic basis vectors that allows us to obtain sparse contextualized word representations. We introduce an information theory-inspired synset representation based on the co-occurrence of word senses and non-zero coordinates for word forms, which allows us to achieve an aggregated F-score of 78.8 over a combination of five standard word sense disambiguation benchmark datasets. We also demonstrate the general applicability of our proposed framework by evaluating it on part-of-speech tagging on four different treebanks. Our results indicate a significant improvement over the application of dense word representations.

2018

300-sparsans at SemEval-2018 Task 9: Hypernymy as interaction of sparse attributes
Gábor Berend | Márton Makrai | Péter Földiák
Proceedings of The 12th International Workshop on Semantic Evaluation

This paper describes 300-sparsans’s participation in SemEval-2018 Task 9: Hypernym Discovery, with a system based on sparse coding and a formal concept hierarchy obtained from word embeddings. Our system took first place in subtasks (1B) Italian (all and entities), (1C) Spanish entities, and (2B) music entities.

2017

Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
Gábor Berend
Transactions of the Association for Computational Linguistics, Volume 5

In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks. The proposed model has favorable generalization properties as it retains over 89.8% of its average POS tagging accuracy when trained on 1.2% of the total available training data, i.e. 150 sentences per language.

SZTE-NLP at SemEval-2017 Task 10: A High Precision Sequence Model for Keyphrase Extraction Utilizing Sparse Coding for Feature Generation
Gábor Berend
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper we introduce our system participating in the 2017 SemEval shared task on keyphrase extraction from scientific documents. We aimed at the creation of a keyphrase extraction approach which relies on as few external resources as possible. Without applying any hand-crafted external resources, and only utilizing a transformed version of word embeddings trained on Wikipedia, our proposed system manages to perform among the best participating systems in terms of precision.

2015

USZEGED: Correction Type-sensitive Normalization of English Tweets Using Efficiently Indexed n-gram Statistics
Gábor Berend | Ervin Tasnádi
Proceedings of the Workshop on Noisy User-generated Text

2014

SZTE-NLP: Aspect level opinion mining exploiting syntactic cues
Viktor Hangya | Gábor Berend | István Varga | Richárd Farkas
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2013

Keyphrase-Driven Document Visualization Tool
Gábor Berend | Richárd Farkas
The Companion Volume of the Proceedings of IJCNLP 2013: System Demonstrations

LFG-based Features for Noun Number and Article Grammatical Errors
Gábor Berend | Veronika Vincze | Sina Zarrieß | Richárd Farkas
Proceedings of the Seventeenth Conference on Computational Natural Language Learning: Shared Task

SZTE-NLP: Sentiment Detection on Twitter Messages
Viktor Hangya | Gábor Berend | Richárd Farkas
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

How to Evaluate Opinionated Keyphrase Extraction?
Gábor Berend | Veronika Vincze
Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis

2011

Noun Compound and Named Entity Recognition and their Usability in Keyphrase Extraction
István Nagy T. | Gábor Berend | Veronika Vincze
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

Multiword Expressions and Named Entities in the Wiki50 Corpus
Veronika Vincze | István Nagy T. | Gábor Berend
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

Domain-Dependent Identification of Multiword Expressions
István Nagy T. | Veronika Vincze | Gábor Berend
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

Domain-Dependent Detection of Light Verb Constructions
István T. Nagy | Gábor Berend | György Móra | Veronika Vincze
Proceedings of the Second Student Research Workshop associated with RANLP 2011

Inter-domain Opinion Phrase Extraction Based on Feature Augmentation
Gábor Berend | István T. Nagy | György Móra | Veronika Vincze
Proceedings of the Second Student Research Workshop associated with RANLP 2011

Detecting Noun Compounds and Light Verb Constructions: a Contrastive Study
Veronika Vincze | István Nagy T. | Gábor Berend
Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World

Opinion Expression Mining by Exploiting Keyphrase Extraction
Gábor Berend
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

SZTERGAK : Feature Engineering for Keyphrase Extraction
Gábor Berend | Richárd Farkas
Proceedings of the 5th International Workshop on Semantic Evaluation