Johannes Daxenberger


How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation
Steffen Eger | Johannes Daxenberger | Iryna Gurevych
Proceedings of the 24th Conference on Computational Natural Language Learning

Sentence encoders map sentences to real-valued vectors for use in downstream applications. To peek into these representations (e.g., to increase interpretability of their results), probing tasks have been designed which query them for linguistic knowledge. However, designing probing tasks for lesser-resourced languages is tricky, because these often lack the large-scale annotated data or (high-quality) dependency parsers that are a prerequisite of probing task design in English. To investigate how to probe sentence embeddings in such cases, we examine the sensitivity of probing task results to structural design choices, conducting the first such large-scale study. We show that design choices like size of the annotated probing dataset and type of classifier used for evaluation do (sometimes substantially) influence probing outcomes. We then probe embeddings in a multilingual setup with design choices that lie in a ‘stable region’, as we identify for English, and find that results on English do not transfer to other languages. Fairer and more comprehensive sentence-level probing evaluation should thus be carried out on multiple languages in the future.

Evaluation of Argument Search Approaches in the Context of Argumentative Dialogue Systems
Niklas Rach | Yuki Matsuda | Johannes Daxenberger | Stefan Ultes | Keiichi Yasumoto | Wolfgang Minker
Proceedings of the 12th Language Resources and Evaluation Conference

We present an approach to evaluate argument search techniques in view of their use in argumentative dialogue systems by assessing quality aspects of the retrieved arguments. To this end, we introduce a dialogue system that presents arguments by means of a virtual avatar and synthetic speech to users and allows them to rate the presented content in four different categories (Interesting, Convincing, Comprehensible, Relation). The approach is applied in a user study in order to compare two state-of-the-art argument search engines with each other and with a system based on traditional web search. The results show a significant advantage of the two search engines over the baseline. Moreover, the two search engines show significant advantages over each other in different categories, thereby reflecting strengths and weaknesses of the different underlying techniques.


Classification and Clustering of Arguments with Contextualized Word Embeddings
Nils Reimers | Benjamin Schiller | Tilman Beck | Johannes Daxenberger | Christian Stab | Iryna Gurevych
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We experiment with two recent contextualized word embedding methods (ELMo and BERT) in the context of open-domain argument search. For the first time, we show how to leverage the power of contextualized word embeddings to classify and cluster topic-dependent arguments, achieving impressive results on both tasks and across multiple datasets. For argument classification, we improve the state of the art for the UKP Sentential Argument Mining Corpus by 20.8 percentage points and for the IBM Debater - Evidence Sentences dataset by 7.4 percentage points. For the understudied task of argument clustering, we propose a pre-training step which improves by 7.8 percentage points over strong baselines on a novel dataset, and by 12.3 percentage points for the Argument Facet Similarity (AFS) Corpus.


Multi-Task Learning for Argumentation Mining in Low-Resource Settings
Claudia Schulz | Steffen Eger | Johannes Daxenberger | Tobias Kahse | Iryna Gurevych
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

We investigate whether and where multi-task learning (MTL) can improve performance on NLP problems related to argumentation mining (AM), in particular argument component identification. Our results show that MTL performs particularly well (and better than single-task learning) when little training data is available for the main task, a common scenario in AM. Our findings challenge previous assumptions that conceptualizations across AM datasets are divergent and that MTL is difficult for semantic or higher-level tasks.

ArgumenText: Searching for Arguments in Heterogeneous Sources
Christian Stab | Johannes Daxenberger | Chris Stahlhut | Tristan Miller | Benjamin Schiller | Christopher Tauchmann | Steffen Eger | Iryna Gurevych
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

Argument mining is a core technology for enabling argument search in large corpora. However, most current approaches fall short when applied to heterogeneous texts. In this paper, we present an argument retrieval system capable of retrieving sentential arguments for any given controversial topic. By analyzing the highest-ranked results extracted from Web sources, we found that our system covers 89% of arguments found in expert-curated lists of arguments from an online debate portal, and also identifies additional valid arguments.

Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!
Steffen Eger | Johannes Daxenberger | Christian Stab | Iryna Gurevych
Proceedings of the 27th International Conference on Computational Linguistics

Argumentation mining (AM) requires the identification of complex discourse structures and has lately been applied with success monolingually. In this work, we show that the existing resources are, however, not adequate for assessing cross-lingual AM, due to their heterogeneity or lack of complexity. We therefore create suitable parallel corpora by (human and machine) translating a popular AM dataset consisting of persuasive student essays into German, French, Spanish, and Chinese. We then compare (i) annotation projection and (ii) bilingual word embeddings based direct transfer strategies for cross-lingual AM, finding that the former performs considerably better and almost eliminates the loss from cross-lingual transfer. Moreover, we find that annotation projection works equally well when using either costly human or cheap machine translations. Our code and data are available at


Neural End-to-End Learning for Computational Argumentation Mining
Steffen Eger | Johannes Daxenberger | Iryna Gurevych
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We investigate neural techniques for end-to-end computational argumentation mining (AM). We frame AM both as a token-based dependency parsing and as a token-based sequence tagging problem, including a multi-task learning setup. Contrary to models that operate on the argument component level, we find that framing AM as dependency parsing leads to subpar performance results. In contrast, less complex (local) tagging models based on BiLSTMs perform robustly across classification scenarios, being able to catch long-range dependencies inherent to the AM problem. Moreover, we find that jointly learning ‘natural’ subtasks, in a multi-task learning setup, improves performance.

Distantly Supervised POS Tagging of Low-Resource Languages under Extreme Data Sparsity: The Case of Hittite
Maria Sukhareva | Francesco Fuscagni | Johannes Daxenberger | Susanne Görke | Doris Prechel | Iryna Gurevych
Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

This paper presents a statistical approach to automatic morphosyntactic annotation of Hittite transcripts. Hittite is an extinct Indo-European language using the cuneiform script. There are currently no morphosyntactic annotations available for Hittite, so we explored methods of distant supervision. The annotations were projected from parallel German translations of the Hittite texts. In order to reduce data sparsity, we applied stemming of German and Hittite texts. As there is no off-the-shelf Hittite stemmer, a stemmer for Hittite was developed for this purpose. The resulting annotation projections were used to train a POS tagger, achieving an accuracy of 69% on a test sample. To our knowledge, this is the first attempt at statistical POS tagging of a cuneiform language.

What is the Essence of a Claim? Cross-Domain Claim Identification
Johannes Daxenberger | Steffen Eger | Ivan Habernal | Christian Stab | Iryna Gurevych
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Argument mining has become a popular research area in NLP. It typically includes the identification of argumentative components, e.g. claims, as the central component of an argument. We perform a qualitative analysis across six different datasets and show that these appear to conceptualize claims quite differently. To learn about the consequences of such different conceptualizations of claim for practical applications, we carried out extensive experiments using state-of-the-art feature-rich and deep learning systems, to identify claims in a cross-domain fashion. While the divergent conceptualization of claims in different datasets is indeed harmful to cross-domain classification, we show that there are shared properties on the lexical level as well as system configurations that can help to overcome these gaps.


Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia
Johannes Daxenberger | Iryna Gurevych
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

DKPro TC: A Java-based Framework for Supervised Learning Experiments on Textual Data
Johannes Daxenberger | Oliver Ferschke | Iryna Gurevych | Torsten Zesch
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations


Automatically Classifying Edit Categories in Wikipedia Revisions
Johannes Daxenberger | Iryna Gurevych
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing


A Corpus-Based Study of Edit Categories in Featured and Non-Featured Wikipedia Articles
Johannes Daxenberger | Iryna Gurevych
Proceedings of COLING 2012