Katja Filippova


2020

Controlled Hallucinations: Learning to Generate Faithfully from Noisy Data
Katja Filippova
Findings of the Association for Computational Linguistics: EMNLP 2020

Neural text generation (data- or text-to-text) demonstrates remarkable performance when training data is abundant, which for many applications is not the case. To collect a large corpus of parallel data, heuristic rules are often used, but they inevitably let noise into the data, such as phrases in the output which cannot be explained by the input. Consequently, models pick up on the noise and may hallucinate, that is, generate fluent but unsupported text. Our contribution is a simple but powerful technique to treat such hallucinations as a controllable aspect of the generated text, without dismissing any input and without modifying the model architecture. On the WikiBio corpus (Lebret et al., 2016), a particularly noisy dataset, we demonstrate the efficacy of the technique in both an automatic and a human evaluation.

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
Jasmijn Bastings | Katja Filippova
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

There is a recent surge of interest in using attention as explanation of model predictions, with mixed evidence on whether attention can be used as such. While attention conveniently gives us one weight per input token and is easily extracted, it is often unclear toward what goal it is used as explanation. We find that often that goal, whether explicitly stated or not, is to find out what input tokens are the most relevant to a prediction, and that the implied user for the explanation is a model developer. For this goal and user, we argue that input saliency methods are better suited, and that there are no compelling reasons to use attention, despite the coincidence that it provides a weight for each input. With this position paper, we hope to shift some of the recent focus on attention to saliency methods, and for authors to clearly state the goal and user for their explanations.

2018

Sentence-Level Fluency Evaluation: References Help, But Can Be Spared!
Katharina Kann | Sascha Rothe | Katja Filippova
Proceedings of the 22nd Conference on Computational Natural Language Learning

Motivated by recent findings on the probabilistic modeling of acceptability judgments, we propose syntactic log-odds ratio (SLOR), a normalized language model score, as a metric for referenceless fluency evaluation of natural language generation output at the sentence level. We further introduce WPSLOR, a novel WordPiece-based version, which harnesses a more compact language model. Even though word-overlap metrics like ROUGE are computed with the help of hand-written references, our referenceless methods obtain a significantly higher correlation with human fluency scores on a benchmark dataset of compressed sentences. Finally, we present ROUGE-LM, a reference-based metric which is a natural extension of WPSLOR to the case of available references. We show that ROUGE-LM yields a significantly higher correlation with human judgments than all baseline metrics, including WPSLOR on its own.
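
As a rough illustration of the normalized language model score mentioned in this abstract, SLOR is commonly written as the sentence log-probability under a language model minus the sum of the unigram log-probabilities of its tokens, divided by the sentence length. The sketch below assumes that definition; the function name `slor`, the toy unigram table, and the toy sentence probability are hypothetical and are not taken from the paper's code or data.

```python
import math


def slor(sentence_logprob, tokens, unigram_logprob):
    """Length-normalized log-odds of a sentence (a sketch of SLOR).

    sentence_logprob: natural-log probability of the whole sentence under
        some language model (assumed to be computed elsewhere).
    tokens: the sentence's tokens (words or WordPieces).
    unigram_logprob: mapping from token to its natural-log unigram probability.
    """
    if not tokens:
        raise ValueError("SLOR is undefined for an empty sentence")
    # Sum of unigram log-probabilities = log of the product of unigram probabilities.
    unigram_total = sum(unigram_logprob[t] for t in tokens)
    # Subtracting the unigram term discounts rare words; dividing by the
    # token count discounts sentence length.
    return (sentence_logprob - unigram_total) / len(tokens)


# Toy numbers, chosen only for illustration.
toy_unigrams = {
    "the": math.log(0.05),
    "cat": math.log(0.001),
    "sat": math.log(0.002),
}
toy_tokens = ["the", "cat", "sat"]
toy_sentence_logprob = math.log(1e-6)  # hypothetical LM score for the sentence

score = slor(toy_sentence_logprob, toy_tokens, toy_unigrams)
print(round(score, 4))
```

A higher score means the language model finds the sentence more probable than its individual word frequencies alone would suggest. A WordPiece-based variant in this spirit would apply the same computation over subword tokens instead of words.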

2015

Sentence Compression by Deletion with LSTMs
Katja Filippova | Enrique Alfonseca | Carlos A. Colmenares | Lukasz Kaiser | Oriol Vinyals
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Idest: Learning a Distributed Representation for Event Patterns
Sebastian Krause | Enrique Alfonseca | Katja Filippova | Daniele Pighin
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Modelling Events through Memory-based, Open-IE Patterns for Abstractive Summarization
Daniele Pighin | Marco Cornolti | Enrique Alfonseca | Katja Filippova
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Opinion Mining on YouTube
Aliaksei Severyn | Alessandro Moschitti | Olga Uryupina | Barbara Plank | Katja Filippova
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

Overcoming the Lack of Parallel Data in Sentence Compression
Katja Filippova | Yasemin Altun
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

Pattern Learning for Relation Extraction with a Hierarchical Topic Model
Enrique Alfonseca | Katja Filippova | Jean-Yves Delort | Guillermo Garrido
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

User Demographics and Language in an Implicit Social Network
Katja Filippova
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Proceedings of the Workshop on Monolingual Text-To-Text Generation
Katja Filippova | Stephen Wan
Proceedings of the Workshop on Monolingual Text-To-Text Generation

2010

Multi-Sentence Compression: Finding Shortest Paths in Word Graphs
Katja Filippova
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

Company-Oriented Extractive Summarization of Financial News
Katja Filippova | Mihai Surdeanu | Massimiliano Ciaramita | Hugo Zaragoza
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Tree Linearization in English: Improving Language Model Based Approaches
Katja Filippova | Michael Strube
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

2008

Dependency Tree Based Sentence Compression
Katja Filippova | Michael Strube
Proceedings of the Fifth International Natural Language Generation Conference

Sentence Fusion via Dependency Graph Compression
Katja Filippova | Michael Strube
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

2007

Extending the Entity-grid Coherence Model to Semantically Related Entities
Katja Filippova | Michael Strube
Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG 07)

Generating Constituent Order in German Clauses
Katja Filippova | Michael Strube
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Using linguistically motivated features for paragraph boundary identification
Katja Filippova | Michael Strube
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing