Alexander Popov


2020

Reconstructing NER Corpora: a Case Study on Bulgarian
Iva Marinova | Laska Laskova | Petya Osenova | Kiril Simov | Alexander Popov
Proceedings of the 12th Language Resources and Evaluation Conference

The paper reports on the use of deep learning methods for improving a Named Entity Recognition (NER) training corpus and for predicting and annotating new types in a test corpus. We show how the annotations in a type-based corpus of named entities (NE) were populated as occurrences within it, thus ensuring density of the training information. A deep learning model was adopted for discovering inconsistencies in the initial annotation and for learning new NE types. The evaluation results improve after data curation, randomization and deduplication.

Implementing an End-to-End Treebank-Informed Pipeline for Bulgarian
Alexander Popov | Petya Osenova | Kiril Simov
Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories

2019

Graph Embeddings for Frame Identification
Alexander Popov | Jennifer Sikos
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Lexical resources such as WordNet (Miller, 1995) and FrameNet (Baker et al., 1998) are organized as graphs, where relationships between words are made explicit via the structure of the resource. This work explores how structural information from these lexical resources can lead to gains in a downstream task, namely frame identification. While much of the current work in frame identification uses various neural architectures to predict frames, those neural architectures only use representations of frames based on annotated corpus data. We demonstrate how incorporating knowledge directly from the FrameNet graph structure improves the performance of a neural network-based frame identification system. Specifically, we construct a bidirectional LSTM with a loss function that incorporates various graph- and corpus-based frame embeddings for learning and ultimately achieves strong performance gains with the graph-based embeddings over corpus-based embeddings alone.

Know Your Graph. State-of-the-Art Knowledge-Based WSD
Alexander Popov | Kiril Simov | Petya Osenova
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

This paper introduces several improvements over the current state of the art in knowledge-based word sense disambiguation. Those innovations are the result of modifying and enriching a knowledge base created originally on the basis of WordNet. They reflect several separate but connected strategies: manipulating the shape and the content of the knowledge base, assigning weights over the relations in the knowledge base, and adding new relations to it. The main contribution of the paper is to demonstrate that the previously proposed knowledge bases organize linguistic and world knowledge suboptimally for the task of word sense disambiguation. In doing so, the paper also establishes a new state of the art for knowledge-based approaches. Its best models are competitive in the broader context of supervised systems as well.

2017

Word Sense Disambiguation with Recurrent Neural Networks
Alexander Popov
Proceedings of the Student Research Workshop Associated with RANLP 2017

This paper presents a neural network architecture for word sense disambiguation (WSD). The architecture employs recurrent neural layers, more specifically LSTM cells, in order to capture information about word order and to easily incorporate distributed word representations (embeddings) as features, without having to use a fixed window of text. The paper demonstrates that the architecture is able to compete with the most successful supervised systems for WSD and that there is an abundance of possible improvements to take it to the current state of the art. In addition, it briefly explores the potential of combining different types of embeddings as input features; it also discusses possible ways of generating “artificial corpora” from knowledge bases, for the purpose of producing training data and in relation to possible applications of embedding lemmas and word senses in the same space.

2016

Towards Semantic-based Hybrid Machine Translation between Bulgarian and English
Kiril Simov | Petya Osenova | Alexander Popov
Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation (SedMT 2016)

2015

Improving Word Sense Disambiguation with Linguistic Knowledge from a Sense Annotated Treebank
Kiril Simov | Alexander Popov | Petya Osenova
Proceedings of the International Conference Recent Advances in Natural Language Processing

Proceedings of the Student Research Workshop
Irina Temnikova | Ivelina Nikolova | Alexander Popov
Proceedings of the Student Research Workshop

2014

Joint Ensemble Model for POS Tagging and Dependency Parsing
Iliana Simova | Dimitar Vasilev | Alexander Popov | Kiril Simov | Petya Osenova
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages