Aliaksei Severyn


2020

pdf bib
FELIX: Flexible Text Editing Through Tagging and Insertion
Jonathan Mallinson | Aliaksei Severyn | Eric Malmi | Guillermo Garrido
Findings of the Association for Computational Linguistics: EMNLP 2020

We present FELIX – a flexible text-editing approach for generation, designed to derive maximum benefit from the ideas of decoding with bi-directional contexts and self-supervised pretraining. In contrast to conventional sequenceto-sequence (seq2seq) models, FELIX is efficient in low-resource settings and fast at inference time, while being capable of modeling flexible input-output transformations. We achieve this by decomposing the text-editing task into two sub-tasks: tagging to decide on the subset of input tokens and their order in the output text and insertion to in-fill the missing tokens in the output not present in the input. The tagging model employs a novel Pointer mechanism, while the insertion model is based on a Masked Language Model (MLM). Both of these models are chosen to be non-autoregressive to guarantee faster inference. FELIX performs favourably when compared to recent text-editing methods and strong seq2seq baselines when evaluated on four NLG tasks: Sentence Fusion, Machine Translation Automatic Post-Editing, Summarization, and Text Simplification

pdf bib
Unsupervised Text Style Transfer with Padded Masked Language Models
Eric Malmi | Aliaksei Severyn | Sascha Rothe
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose Masker, an unsupervised text-editing method for style transfer. To tackle cases when no parallel source–target pairs are available, we train masked language models (MLMs) for both the source and the target domain. Then we find the text spans where the two models disagree the most in terms of likelihood. This allows us to identify the source tokens to delete to transform the source text to match the style of the target domain. The deleted tokens are replaced with the target MLM, and by using a padded MLM variant, we avoid having to predetermine the number of inserted tokens. Our experiments on sentence fusion and sentiment transfer demonstrate that Masker performs competitively in a fully unsupervised setting. Moreover, in low-resource settings, it improves supervised methods’ accuracy by over 10 percentage points when pre-training them on silver training data generated by Masker.

pdf bib
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
Sascha Rothe | Shashi Narayan | Aliaksei Severyn
Transactions of the Association for Computational Linguistics, Volume 8

Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing. By warm-starting from the publicly released checkpoints, NLP practitioners have pushed the state-of-the-art on multiple benchmarks while saving significant amounts of compute time. So far the focus has been mainly on the Natural Language Understanding tasks. In this paper, we demonstrate the efficacy of pre-trained checkpoints for Sequence Generation. We developed a Transformer-based sequence-to-sequence model that is compatible with publicly available pre-trained BERT, GPT-2, and RoBERTa checkpoints and conducted an extensive empirical study on the utility of initializing our model, both encoder and decoder, with these checkpoints. Our models result in new state-of-the-art results on Machine Translation, Text Summarization, Sentence Splitting, and Sentence Fusion.

2019

pdf bib
Encode, Tag, Realize: High-Precision Text Editing
Eric Malmi | Sebastian Krause | Sascha Rothe | Daniil Mirylenka | Aliaksei Severyn
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We propose LaserTagger - a sequence tagging approach that casts text generation as a text editing task. Target texts are reconstructed from the inputs using three main edit operations: keeping a token, deleting it, and adding a phrase before the token. To predict the edit operations, we propose a novel model, which combines a BERT encoder with an autoregressive Transformer decoder. This approach is evaluated on English text on four tasks: sentence fusion, sentence splitting, abstractive summarization, and grammar correction. LaserTagger achieves new state-of-the-art results on three of these tasks, performs comparably to a set of strong seq2seq baselines with a large number of training examples, and outperforms them when the number of examples is limited. Furthermore, we show that at inference time tagging can be more than two orders of magnitude faster than comparable seq2seq models, making it more attractive for running in a live environment.

2017

pdf bib
RelTextRank: An Open Source Framework for Building Relational Syntactic-Semantic Text Pair Representations
Kateryna Tymoshenko | Alessandro Moschitti | Massimo Nicosia | Aliaksei Severyn
Proceedings of ACL 2017, System Demonstrations

pdf bib
A Hybrid Convolutional Variational Autoencoder for Text Generation
Stanislau Semeniuta | Aliaksei Severyn | Erhardt Barth
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

In this paper we explore the effect of architectural choices on learning a variational autoencoder (VAE) for text generation. In contrast to the previously introduced VAE model for text where both the encoder and decoder are RNNs, we propose a novel hybrid architecture that blends fully feed-forward convolutional and deconvolutional components with a recurrent language model. Our architecture exhibits several attractive properties such as faster run time and convergence, ability to better handle long sequences and, more importantly, it helps to avoid the issue of the VAE collapsing to a deterministic model.

2016

pdf bib
Recurrent Dropout without Memory Loss
Stanislau Semeniuta | Aliaksei Severyn | Erhardt Barth
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper presents a novel approach to recurrent neural network (RNN) regularization. Differently from the widely adopted dropout method, which is applied to forward connections of feedforward architectures or RNNs, we propose to drop neurons directly in recurrent connections in a way that does not cause loss of long-term memory. Our approach is as easy to implement and apply as the regular feed-forward dropout and we demonstrate its effectiveness for the most effective modern recurrent network – Long Short-Term Memory network. Our experiments on three NLP benchmarks show consistent improvements even when combined with conventional feed-forward dropout.

pdf bib
Globally Normalized Transition-Based Neural Networks
Daniel Andor | Chris Alberti | David Weiss | Aliaksei Severyn | Alessandro Presta | Kuzman Ganchev | Slav Petrov | Michael Collins
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
On the Automatic Learning of Sentiment Lexicons
Aliaksei Severyn | Alessandro Moschitti
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Distributional Neural Networks for Automatic Resolution of Crossword Puzzles
Aliaksei Severyn | Massimo Nicosia | Gianni Barlacchi | Alessandro Moschitti
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf bib
UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification
Aliaksei Severyn | Alessandro Moschitti
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

pdf bib
Opinion Mining on YouTube
Aliaksei Severyn | Alessandro Moschitti | Olga Uryupina | Barbara Plank | Katja Filippova
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Encoding Semantic Resources in Syntactic Structures for Passage Reranking
Kateryna Tymoshenko | Alessandro Moschitti | Aliaksei Severyn
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
SenTube: A Corpus for Sentiment Analysis on YouTube Social Media
Olga Uryupina | Barbara Plank | Aliaksei Severyn | Agata Rotondi | Alessandro Moschitti
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we present SenTube -- a dataset of user-generated comments on YouTube videos annotated for information content and sentiment polarity. It contains annotations that allow to develop classifiers for several important NLP tasks: (i) sentiment analysis, (ii) text categorization (relatedness of a comment to video and/or product), (iii) spam detection, and (iv) prediction of comment informativeness. The SenTube corpus favors the development of research on indexing and searching YouTube videos exploiting information derived from comments. The corpus will cover several languages: at the moment, we focus on English and Italian, with Spanish and Dutch parts scheduled for the later stages of the project. For all the languages, we collect videos for the same set of products, thus offering possibilities for multi- and cross-lingual experiments. The paper provides annotation guidelines, corpus statistics and annotator agreement details.

2013

pdf bib
Automatic Feature Engineering for Answer Selection and Extraction
Aliaksei Severyn | Alessandro Moschitti
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Learning Adaptable Patterns for Passage Reranking
Aliaksei Severyn | Massimo Nicosia | Alessandro Moschitti
Proceedings of the Seventeenth Conference on Computational Natural Language Learning

pdf bib
Learning Semantic Textual Similarity with Structural Representations
Aliaksei Severyn | Massimo Nicosia | Alessandro Moschitti
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
iKernels-Core: Tree Kernel Learning for Textual Similarity
Aliaksei Severyn | Massimo Nicosia | Alessandro Moschitti
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity