Dietrich Klakow


2020

Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence
Xiaoyu Shen | Ernie Chang | Hui Su | Cheng Niu | Dietrich Klakow
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The neural attention model has achieved great success in data-to-text generation tasks. Though usually excelling at producing fluent text, it suffers from missing information, repetition and “hallucination”. Due to the black-box nature of the neural attention architecture, avoiding these problems in a systematic way is non-trivial. To address this concern, we propose to explicitly segment target text into fragment units and align them with their data correspondences. The segmentation and correspondence are jointly learned as latent variables without any human annotations. We further impose a soft statistical constraint to regularize the segmental granularity. The resulting architecture maintains the same expressive power as neural attention models, while being able to generate fully interpretable outputs with several times less computational cost. On both E2E and WebNLG benchmarks, we show the proposed model consistently outperforms its neural attention counterparts.

CoLi at UdS at SemEval-2020 Task 12: Offensive Tweet Detection with Ensembling
Kathryn Chapman | Johannes Bernhard | Dietrich Klakow
Proceedings of the Fourteenth Workshop on Semantic Evaluation

We present our submission and results for SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020), where we participated in offensive tweet classification tasks in English, Arabic, Greek, Turkish and Danish. Our approach included classical machine learning architectures such as support vector machines and logistic regression combined in an ensemble with a multilingual transformer-based model (XLM-R). The transformer model is trained on all languages combined in order to create a fully multilingual model which can leverage knowledge between languages. The machine learning model hyperparameters are fine-tuned and the statistically best performing ones are included in the final ensemble.

ATC-ANNO: Semantic Annotation for Air Traffic Control with Assistive Auto-Annotation
Marc Schulder | Johannah O’Mahony | Yury Bakanouski | Dietrich Klakow
Proceedings of the 12th Language Resources and Evaluation Conference

In air traffic control, assistant systems support air traffic controllers in their work. To improve the reactivity and accuracy of the assistant, automatic speech recognition can monitor the commands uttered by the controller. However, to provide sufficient training data for the speech recognition system, many hours of air traffic communications have to be transcribed and semantically annotated. For this purpose we developed the annotation tool ATC-ANNO. It provides a number of features to support the annotator in their task, such as auto-complete suggestions for semantic tags, access to preliminary speech recognition predictions, syntax highlighting and consistency indicators. Its core assistive feature, however, is its ability to automatically generate semantic annotations. Although it is based on a simple hand-written finite state grammar, it is also able to annotate sentences that deviate from this grammar. We evaluate the impact of different features on annotator efficiency and find that automatic annotation allows annotators to cover four times as many utterances in the same time.

Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Aditya Mogadala | Sandro Pezzelle | Dietrich Klakow | Marie-Francine Moens | Zeynep Akata
Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers
Marius Mosbach | Anna Khokhlova | Michael A. Hedderich | Dietrich Klakow
Findings of the Association for Computational Linguistics: EMNLP 2020

Fine-tuning pre-trained contextualized embedding models has become an integral part of the NLP pipeline. At the same time, probing has emerged as a way to investigate the linguistic knowledge captured by pre-trained models. Very little is, however, understood about how fine-tuning affects the representations of pre-trained models and thereby the linguistic knowledge they encode. This paper contributes towards closing this gap. We study three different pre-trained models: BERT, RoBERTa, and ALBERT, and investigate through sentence-level probing how fine-tuning affects their representations. We find that for some probing tasks fine-tuning leads to substantial changes in accuracy, possibly suggesting that fine-tuning introduces or even removes linguistic knowledge from a pre-trained model. These changes, however, vary greatly across different models, fine-tuning and probing tasks. Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model and these changes are typically larger for higher layers, only in very few cases does fine-tuning have a positive effect on probing accuracy that is larger than just using the pre-trained model with a strong pooling method. Based on our findings, we argue that both positive and negative effects of fine-tuning on probing require a careful interpretation.

Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification
Badr M. Abdullah | Jacek Kudera | Tania Avgustinova | Bernd Möbius | Dietrich Klakow
Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects

Deep neural networks have been employed for various spoken language recognition tasks, including tasks that are multilingual by definition such as spoken language identification (LID). In this paper, we present a neural model for Slavic language identification in speech signals and analyze its emergent representations to investigate whether they reflect objective measures of language relatedness or non-linguists’ perception of language similarity. While our analysis shows that the language representation space indeed captures language relatedness to a great extent, we find perceptual confusability to be the best predictor of the language representation similarity.

Label Propagation-Based Semi-Supervised Learning for Hate Speech Classification
Ashwin Geet D’Sa | Irina Illina | Dominique Fohr | Dietrich Klakow | Dana Ruiter
Proceedings of the First Workshop on Insights from Negative Results in NLP

Research on hate speech classification has received increased attention. In real-life scenarios, only a small amount of labeled hate speech data is available to train a reliable classifier. Semi-supervised learning takes advantage of a small amount of labeled data and a large amount of unlabeled data. In this paper, label propagation-based semi-supervised learning is explored for the task of hate speech classification. The quality of labeling the unlabeled set depends on the input representations. In this work, we show that pre-trained representations are label agnostic, and when used with label propagation yield poor results. Neural network-based fine-tuning can be adopted to learn task-specific representations using a small amount of labeled data. We show that fully fine-tuned representations may not always be the best representations for label propagation and that intermediate representations may perform better in a semi-supervised setup.

A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English
Marius Mosbach | Stefania Degaetano-Ortlieb | Marie-Pauline Krielke | Badr M. Abdullah | Dietrich Klakow
Proceedings of the 28th International Conference on Computational Linguistics

Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on. We evaluate three models (BERT, RoBERTa, and ALBERT), testing their grammatical and semantic knowledge by sentence-level probing, diagnostic cases, and masked prediction tasks. We focus on relative clauses (in American English) as a complex phenomenon needing contextual information and antecedent identification to be resolved. Based on a naturalistic dataset, probing shows that all three models indeed capture linguistic knowledge about grammaticality, achieving high performance. Evaluation on diagnostic cases and masked prediction tasks considering fine-grained linguistic knowledge, however, shows pronounced model-specific weaknesses, especially on semantic knowledge, strongly impacting models’ performance. Our results highlight the importance of (a) model comparison in evaluation tasks and (b) building up claims of model performance and the linguistic knowledge they capture beyond purely probing-based evaluations.

Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages
Michael A. Hedderich | David Adelani | Dawei Zhu | Jesujoba Alabi | Udia Markus | Dietrich Klakow
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Multilingual transformer models like mBERT and XLM-RoBERTa have obtained great improvements for many NLP tasks on a variety of languages. However, recent works also showed that results from high-resource languages could not be easily transferred to realistic, low-resource scenarios. In this work, we study trends in performance for different amounts of available resources for the three African languages Hausa, isiXhosa and Yorùbá on both NER and topic classification. We show that in combination with transfer learning or distant supervision, these models can achieve with as little as 10 or 100 labeled sentences the same performance as baselines with much more supervised training data. However, we also find settings where this does not hold. Our discussions and additional experiments on assumptions such as time and hardware restrictions highlight challenges and opportunities in low-resource learning.

HUMAN: Hierarchical Universal Modular ANnotator
Moritz Wolf | Dana Ruiter | Ashwin Geet D’Sa | Liane Reiners | Jan Alexandersson | Dietrich Klakow
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

A lot of real-world phenomena are complex and cannot be captured by single task annotations. This causes a need for subsequent annotations, with interdependent questions and answers describing the nature of the subject at hand. Even when a phenomenon is easily captured by a single task, the high specialisation of most annotation tools can result in having to switch to another tool if the task changes only slightly. We introduce HUMAN, a novel web-based annotation tool that addresses the above problems by a) covering a variety of annotation tasks on both textual and image data, and b) using an internal deterministic state machine, allowing the researcher to chain different annotation tasks in an interdependent manner. Further, the modular nature of the tool makes it easy to define new annotation tasks and integrate machine learning algorithms, e.g., for active learning. HUMAN comes with an easy-to-use graphical user interface that simplifies the annotation task and management.

On the Interplay Between Fine-tuning and Sentence-Level Probing for Linguistic Knowledge in Pre-Trained Transformers
Marius Mosbach | Anna Khokhlova | Michael A. Hedderich | Dietrich Klakow
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

Fine-tuning pre-trained contextualized embedding models has become an integral part of the NLP pipeline. At the same time, probing has emerged as a way to investigate the linguistic knowledge captured by pre-trained models. Very little is, however, understood about how fine-tuning affects the representations of pre-trained models and thereby the linguistic knowledge they encode. This paper contributes towards closing this gap. We study three different pre-trained models: BERT, RoBERTa, and ALBERT, and investigate through sentence-level probing how fine-tuning affects their representations. We find that for some probing tasks fine-tuning leads to substantial changes in accuracy, possibly suggesting that fine-tuning introduces or even removes linguistic knowledge from a pre-trained model. These changes, however, vary greatly across different models, fine-tuning and probing tasks. Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model and these changes are typically larger for higher layers, only in very few cases does fine-tuning have a positive effect on probing accuracy that is larger than just using the pre-trained model with a strong pooling method. Based on our findings, we argue that both positive and negative effects of fine-tuning on probing require a careful interpretation.

Defining Explanation in an AI Context
Tejaswani Verma | Christoph Lingenfelder | Dietrich Klakow
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

With the increase in the use of AI systems, a need for explanation systems arises. Building an explanation system requires a definition of explanation. However, the natural language term explanation is difficult to define formally as it includes multiple perspectives from different domains such as psychology, philosophy, and cognitive sciences. We study multiple perspectives and aspects of explainability of recommendations or predictions made by AI systems, and provide a generic definition of explanation. The proposed definition is ambitious and challenging to apply. With the intention to bridge the gap between theory and application, we also propose a possible architecture of an automated explanation system based on our definition of explanation.

2019

Select and Attend: Towards Controllable Content Selection in Text Generation
Xiaoyu Shen | Jun Suzuki | Kentaro Inui | Hui Su | Dietrich Klakow | Satoshi Sekine
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Many text generation tasks naturally contain two steps: content selection and surface realization. Current neural encoder-decoder models conflate both steps into a black-box architecture. As a result, the content to be described in the text cannot be explicitly controlled. This paper tackles this problem by decoupling content selection from the decoder. The decoupled content selection is human interpretable, and its value can be manually manipulated to control the content of generated text. The model can be trained end-to-end without human annotations by maximizing a lower bound of the marginal likelihood. We further propose an effective way to trade off between performance and controllability with a single adjustable hyperparameter. In both data-to-text and headline generation tasks, our model achieves promising results, paving the way for controllable content selection in text generation.

Feature-Dependent Confusion Matrices for Low-Resource NER Labeling with Noisy Labels
Lukas Lange | Michael A. Hedderich | Dietrich Klakow
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In low-resource settings, the performance of supervised labeling models can be improved with automatically annotated or distantly supervised data, which is cheap to create but often noisy. Previous works have shown that significant improvements can be reached by injecting information about the confusion between clean and noisy labels in this additional training data into the classifier training. However, for noise estimation, these approaches either do not take the input features (in our case word embeddings) into account, or they need to learn the noise modeling from scratch which can be difficult in a low-resource setting. We propose to cluster the training data using the input features and then compute different confusion matrices for each cluster. To the best of our knowledge, our approach is the first to leverage feature-dependent noise modeling with pre-initialized confusion matrices. We evaluate on low-resource named entity recognition settings in several languages, showing that our methods improve upon other confusion-matrix based methods by up to 9%.

Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator
Xiaoyu Shen | Yang Zhao | Hui Su | Dietrich Klakow
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Pointer Generators have been the de facto standard for modern summarization systems. However, this architecture faces two major drawbacks: Firstly, the pointer is limited to copying the exact words while ignoring possible inflections or abstractions, which restricts its power of capturing richer latent alignment. Secondly, the copy mechanism results in a strong bias towards extractive generations, where most sentences are produced by simply copying from the source text. In this paper, we address these problems by allowing the model to “edit” pointed tokens instead of always hard copying them. The editing is performed by transforming the pointed word vector into a target space with a learned relation embedding. On three large-scale summarization datasets, we show the model is able to (1) capture more latent alignment relations than exact word matches, (2) improve word alignment accuracy, allowing for better model interpretation and controlling, (3) generate higher-quality summaries validated by both qualitative and quantitative evaluations and (4) bring more abstraction to the generated summaries.

Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Aditya Mogadala | Dietrich Klakow | Sandro Pezzelle | Marie-Francine Moens
Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

Using Multi-Sense Vector Embeddings for Reverse Dictionaries
Michael A. Hedderich | Andrew Yates | Dietrich Klakow | Gerard de Melo
Proceedings of the 13th International Conference on Computational Semantics - Long Papers

Popular word embedding methods such as word2vec and GloVe assign a single vector representation to each word, even if a word has multiple distinct meanings. Multi-sense embeddings instead provide different vectors for each sense of a word. However, they typically cannot serve as a drop-in replacement for conventional single-sense embeddings, because the correct sense vector needs to be selected for each word. In this work, we study the effect of multi-sense embeddings on the task of reverse dictionaries. We propose a technique to easily integrate them into an existing neural network architecture using an attention mechanism. Our experiments demonstrate that large improvements can be obtained when employing multi-sense embeddings both in the input sequence and for the target representation. An analysis of the sense distributions and of the learned attention is provided as well.

Some steps towards the generation of diachronic WordNets
Yuri Bizzoni | Marius Mosbach | Dietrich Klakow | Stefania Degaetano-Ortlieb
Proceedings of the 22nd Nordic Conference on Computational Linguistics

We apply hyperbolic embeddings to trace the dynamics of change of conceptual-semantic relationships in a large diachronic scientific corpus (200 years). Our focus is on emerging scientific fields and the increasingly specialized terminology establishing around them. Reproducing high-quality hierarchical structures such as WordNet on a diachronic scale is a very difficult task. Hyperbolic embeddings can map partial graphs into low dimensional, continuous hierarchical spaces, making more explicit the latent structure of the input. We show that starting from simple lists of word pairs (rather than a list of entities with directional links) it is possible to build diachronic hierarchical semantic spaces which allow us to model a process towards specialization for selected scientific fields.

incom.py - A Toolbox for Calculating Linguistic Distances and Asymmetries between Related Languages
Marius Mosbach | Irina Stenger | Tania Avgustinova | Dietrich Klakow
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Languages may be differently distant from each other and their mutual intelligibility may be asymmetric. In this paper we introduce incom.py, a toolbox for calculating linguistic distances and asymmetries between related languages. incom.py allows expert linguists to quickly and easily perform statistical analyses and compare those with experimental results. We demonstrate the efficacy of incom.py in an intercomprehension experiment on two Slavic languages: Bulgarian and Russian. Using incom.py we were able to validate three methods to measure linguistic distances and asymmetries: Levenshtein distance, word adaptation surprisal, and conditional entropy as predictors of success in a reading intercomprehension experiment.
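
As an illustration of the first of the three measures named above, here is a minimal sketch of Levenshtein distance between two word forms. It is not taken from incom.py; the function names and the choice to normalize by the longer word's length are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


if __name__ == "__main__":
    # Bulgarian and Russian words for "milk", used here only as sample input.
    print(levenshtein("мляко", "молоко"))
    print(normalized_levenshtein("мляко", "молоко"))
```

Plain Levenshtein distance is symmetric, so on its own it cannot express the asymmetries the abstract mentions; those come from the surprisal and entropy measures.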

Term-Based Extraction of Medical Information: Pre-Operative Patient Education Use Case
Martin Wolf | Volha Petukhova | Dietrich Klakow
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

The processing of medical information is not a trivial task for medical non-experts. The paper presents an artificial assistant designed to facilitate reliable access to online medical content. Interactions are modelled as doctor-patient Question Answering sessions within a pre-operative patient education scenario where the system addresses patients’ information needs by explaining medical events and procedures. This requires accurate extraction of medical information from, and reasoning with, available medical knowledge and large amounts of unstructured multilingual online data. Bridging the gap between medical knowledge and data, we explore a language-agnostic approach to mining medical concepts from the standard terminologies, and the data-driven collection of the corresponding seed terms in a distant supervision setting for German. Experimenting with different terminologies, features and term matching strategies, we achieved a promising F-score of 0.91 on the medical term extraction task. The concepts and terms are used to search and retrieve definitions from verified free online resources. The proof-of-concept definition retrieval system is designed and evaluated, showing promising results that were judged acceptable by humans in 92% of cases.

Cross-lingual Transfer Learning for Japanese Named Entity Recognition
Andrew Johnson | Penny Karanasou | Judith Gaspers | Dietrich Klakow
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers)

This work explores cross-lingual transfer learning (TL) for named entity recognition, focusing on bootstrapping Japanese from English. A deep neural network model is adopted and the best combination of weights to transfer is extensively investigated. Moreover, a novel approach is presented that overcomes linguistic differences between this language pair by romanizing a portion of the Japanese input. Experiments are conducted on external datasets, as well as internal large-scale real-world ones. Gains with TL are achieved for all evaluated cases. Finally, the influence on TL of the target dataset size and of the target tagset distribution is further investigated.

Handling Noisy Labels for Robustly Learning from Self-Training Data for Low-Resource Sequence Labeling
Debjit Paul | Mittul Singh | Michael A. Hedderich | Dietrich Klakow
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

In this paper, we address the problem of effectively self-training neural networks in a low-resource setting. Self-training is frequently used to automatically increase the amount of training data. However, in a low-resource scenario, it is less effective due to unreliable annotations created using self-labeling of unlabeled data. We propose to combine self-training with noise handling on the self-labeled data. Directly estimating noise on the combined clean training set and self-labeled data can lead to corruption of the clean data and hence perform worse. Thus, we propose the Clean and Noisy Label Neural Network, which trains on clean and noisy self-labeled data simultaneously by explicitly modelling clean and noisy labels separately. In our experiments on Chunking and NER, this approach performs more robustly than the baselines. Complementary to this explicit approach, noise can also be handled implicitly with the help of an auxiliary learning task. Our method complements such an implicit approach better than other baseline methods, and the combination provides the best performance overall.

2018

The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Volha Petukhova | Andrei Malchanau | Youssef Oualil | Dietrich Klakow | Saturnino Luz | Fasih Haider | Nick Campbell | Dimitris Koryzis | Dimitris Spiliotopoulos | Pierre Albert | Nicklas Linz | Jan Alexandersson
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation
Xiaoyu Shen | Hui Su | Wenjie Li | Dietrich Klakow
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Sequence-to-Sequence (seq2seq) models have become overwhelmingly popular in building end-to-end trainable dialogue systems. Though highly efficient in learning the backbone of human-computer communications, they suffer from the problem of strongly favoring short generic responses. In this paper, we argue that a good response should smoothly connect both the preceding dialogue history and the following conversations. We strengthen this connection by mutual information maximization. To sidestep the non-differentiability of discrete natural language tokens, we introduce an auxiliary continuous code space and map such code space to a learnable prior distribution for generation purposes. Experiments on two dialogue datasets validate the effectiveness of our model, where the generated responses are closely related to the dialogue context and lead to more interactive conversations.

Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data
Michael A. Hedderich | Dietrich Klakow
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP

Manually labeled corpora are expensive to create and often not available for low-resource languages or domains. Automatic labeling approaches are an alternative way to obtain labeled data in a quicker and cheaper way. However, these labels often contain more errors, which can deteriorate a classifier’s performance when trained on this data. We propose a noise layer that is added to a neural network architecture. This allows modeling the noise and training on a combination of clean and noisy data. We show that in a low-resource NER task we can improve performance by up to 35% by using additional, noisy data and handling the noise.
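
The abstract does not spell out the form of the noise layer. One common formulation, assumed here purely for illustration, is a row-stochastic confusion matrix that maps the classifier's distribution over clean labels to a distribution over noisy labels. The function names below are hypothetical and the sketch uses plain Python lists rather than a neural network library.

```python
def estimate_confusion(clean_labels, noisy_labels, num_classes):
    """Estimate p(noisy = j | clean = i) from items carrying both labels.
    Rows with no observations fall back to the identity (no noise assumed)."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for clean, noisy in zip(clean_labels, noisy_labels):
        counts[clean][noisy] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            matrix.append([1.0 if j == i else 0.0 for j in range(num_classes)])
        else:
            matrix.append([c / total for c in row])
    return matrix


def apply_noise_layer(clean_probs, confusion):
    """Map a distribution over clean labels to one over noisy labels:
    p(noisy = j) = sum_i p(clean = i) * confusion[i][j]."""
    num_noisy = len(confusion[0])
    return [
        sum(clean_probs[i] * confusion[i][j] for i in range(len(clean_probs)))
        for j in range(num_noisy)
    ]


if __name__ == "__main__":
    # Toy data: class 0 is mislabeled as class 1 half of the time.
    confusion = estimate_confusion([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
    print(confusion)
    print(apply_noise_layer([0.8, 0.2], confusion))
```

In a setup of this kind, the loss on noisy examples would be computed against the output of `apply_noise_layer`, while clean examples would use the classifier's prediction directly.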

Closing Brackets with Recurrent Neural Networks
Natalia Skachkova | Thomas Trost | Dietrich Klakow
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Many natural and formal languages contain words or symbols that require a matching counterpart for making an expression well-formed. The combination of opening and closing brackets is a typical example of such a construction. Due to their commonness, the ability to follow such rules is important for language modeling. Currently, recurrent neural networks (RNNs) are extensively used for this task. We investigate whether they are capable of learning the rules of opening and closing brackets by applying them to synthetic Dyck languages that consist of different types of brackets. We provide an analysis of the statistical properties of these languages as a baseline and show strengths and limits of Elman-RNNs, GRUs and LSTMs in experiments on random samples of these languages. In terms of perplexity and prediction accuracy, the RNNs get close to the theoretical baseline in most cases.
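
To make the task concrete, the sketch below shows the rule the networks are asked to learn: in a Dyck language, the closing bracket allowed at any point is determined by the most recently opened, still unclosed bracket. The bracket inventory and function names are assumptions for this example and are not taken from the paper.

```python
# Assumed bracket inventory: three bracket types.
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def expected_closer(prefix: str):
    """Return the only closing bracket that may legally come next,
    or None if every bracket in the prefix is already closed.
    Raises ValueError if the prefix itself is ill-formed."""
    stack = []
    for symbol in prefix:
        if symbol in PAIRS:
            stack.append(PAIRS[symbol])  # remember which closer is owed
        elif symbol in CLOSERS:
            if not stack or stack.pop() != symbol:
                raise ValueError(f"ill-formed prefix: {prefix!r}")
        else:
            raise ValueError(f"unknown symbol: {symbol!r}")
    return stack[-1] if stack else None


def is_dyck(word: str) -> bool:
    """True if the word is a well-formed, fully closed bracket sequence."""
    try:
        return expected_closer(word) is None
    except ValueError:
        return False


if __name__ == "__main__":
    print(expected_closer("([{"))
    print(is_dyck("([]{})"), is_dyck("([)]"))
```

A stack solves this exactly, which is why the question of whether a recurrent network with a fixed-size state can approximate it is informative.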

Toward Bayesian Synchronous Tree Substitution Grammars for Sentence Planning
David M. Howcroft | Dietrich Klakow | Vera Demberg
Proceedings of the 11th International Conference on Natural Language Generation

Developing conventional natural language generation systems requires extensive attention from human experts in order to craft complex sets of sentence planning rules. We propose a Bayesian nonparametric approach to learn sentence planning rules by inducing synchronous tree substitution grammars for pairs of text plans and morphosyntactically-specified dependency trees. Our system is able to learn rules which can be used to generate novel texts after training on small datasets.

2017

Parameter Free Hierarchical Graph-Based Clustering for Analyzing Continuous Word Embeddings
Thomas Alexander Trost | Dietrich Klakow
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing

Word embeddings are high-dimensional vector representations of words and are thus difficult to interpret. In order to deal with this, we introduce an unsupervised parameter free method for creating a hierarchical graphical clustering of the full ensemble of word vectors and show that this structure is a geometrically meaningful representation of the original relations between the words. This newly obtained representation can be used for better understanding and thus improving the embedding algorithm and exhibits semantic meaning, so it can also be utilized in a variety of language processing tasks like categorization or measuring similarity.

2016

Creating Annotated Dialogue Resources: Cross-domain Dialogue Act Classification
Dilafruz Amanova | Volha Petukhova | Dietrich Klakow
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper describes a method to automatically create dialogue resources annotated with dialogue act information by reusing existing dialogue corpora. Numerous dialogue corpora are available for research purposes and many of them are annotated with dialogue act information that captures the intentions encoded in user utterances. Annotated dialogue resources, however, differ in various respects: data collection settings and modalities used, dialogue task domains and scenarios (if any) underlying the collection, number and roles of dialogue participants involved and dialogue act annotation schemes applied. The presented study encompasses three phases of data-driven investigation. First, we assess the importance of various types of features and their combinations for effective cross-domain dialogue act classification. Second, we establish the best predictive model by comparing various cross-corpora training settings. Finally, we specify model adaptation procedures and explore late fusion approaches to optimize the overall classification decision-making process. The proposed methodology accounts for empirically motivated and technically sound classification procedures that may reduce annotation and training costs significantly.

pdf bib
Orthographic and Morphological Correspondences between Related Slavic Languages as a Base for Modeling of Mutual Intelligibility
Andrea Fischer | Klára Jágrová | Irina Stenger | Tania Avgustinova | Dietrich Klakow | Roland Marti
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In an intercomprehension scenario, typically a native speaker of language L1 is confronted with output from an unknown, but related language L2. In this setting, the degree to which the receiver recognizes the unfamiliar words greatly determines communicative success. Despite exhibiting great string-level differences, cognates may be recognized very successfully if the receiver is aware of regular correspondences which allow the unknown word to be transformed into its familiar form. Modeling L1-L2 intercomprehension then requires the identification of all the regular correspondences between languages L1 and L2. We here present a set of linguistic orthographic correspondences manually compiled from comparative linguistics literature along with a set of statistically-inferred suggestions for correspondence rules. In order to do statistical inference, we followed the Minimum Description Length principle, which proposes to choose those rules which are most effective at describing the data. Our statistical model was able to reproduce most of our linguistic correspondences (88.5% for Czech-Polish and 75.7% for Bulgarian-Russian) and furthermore allowed us to easily identify many more non-trivial correspondences which also cover aspects of morphology.

pdf bib
Event participant modelling with neural networks
Ottokar Tilk | Vera Demberg | Asad Sayeed | Dietrich Klakow | Stefan Thater
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Long-Short Range Context Neural Networks for Language Modeling
Youssef Oualil | Mittul Singh | Clayton Greenberg | Dietrich Klakow
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Sub-Word Similarity based Search for Embeddings: Inducing Rare-Word Embeddings for Word Similarity Tasks and Language Modelling
Mittul Singh | Clayton Greenberg | Youssef Oualil | Dietrich Klakow
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Training good word embeddings requires large amounts of data. Out-of-vocabulary words will still be encountered at test-time, leaving these words without embeddings. To overcome this lack of embeddings for rare words, existing methods leverage morphological features to generate embeddings. While the existing methods use computationally-intensive rule-based (Soricut and Och, 2015) or tool-based (Botha and Blunsom, 2014) morphological analysis to generate embeddings, our system applies a computationally-simpler sub-word search on words that have existing embeddings. Embeddings of the sub-word search results are then combined using string similarity functions to generate rare word embeddings. We augmented pre-trained word embeddings with these novel embeddings and evaluated on a rare word similarity task, obtaining up to 3 times improvement in correlation over the original set of embeddings. Applying our technique to embeddings trained on larger datasets led to on-par performance with the existing state-of-the-art for this task. Additionally, while analysing augmented embeddings in a log-bilinear language model, we observed up to 50% reduction in rare word perplexity in comparison to other more complex language models.

pdf bib
Unsupervised morph segmentation and statistical language models for vocabulary expansion
Matti Varjokallio | Dietrich Klakow
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

pdf bib
Linguistically Motivated Question Classification
Alexandr Chernov | Volha Petukhova | Dietrich Klakow
Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015)

pdf bib
Towards Flexible, Small-Domain Surface Generation: Combining Data-Driven and Grammatical Approaches
Andrea Fischer | Vera Demberg | Dietrich Klakow
Proceedings of the 15th European Workshop on Natural Language Generation (ENLG)

2014

pdf bib
Automatic Food Categorization from Large Unlabeled Corpora and Its Impact on Relation Extraction
Michael Wiegand | Benjamin Roth | Dietrich Klakow
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
RelationFactory: A Fast, Modular and Effective System for Knowledge Base Population
Benjamin Roth | Tassilo Barth | Grzegorz Chrupała | Martin Gropp | Dietrich Klakow
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Unsupervised Parsing for Generating Surface-Based Relation Extraction Patterns
Jens Illig | Benjamin Roth | Dietrich Klakow
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

pdf bib
Separating Brands from Types: an Investigation of Different Features for the Food Domain
Michael Wiegand | Dietrich Klakow
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
The DBOX Corpus Collection of Spoken Human-Human and Human-Machine Dialogues
Volha Petukhova | Martin Gropp | Dietrich Klakow | Gregor Eigner | Mario Topf | Stefan Srb | Petr Motlicek | Blaise Potard | John Dines | Olivier Deroo | Ronny Egeler | Uwe Meinz | Steffen Liersch | Anna Schmidt
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes the data collection and annotation carried out within the DBOX project (Eureka project, number E! 7152). This project aims to develop interactive games based on spoken natural language human-computer dialogues, in 3 European languages: English, German and French. We collect the DBOX data continuously. We first start with human-human Wizard of Oz experiments to collect human-human data in order to model natural human dialogue behaviour, for better understanding of phenomena of human interactions and predicting interlocutors' actions, and then replace the human Wizard by an increasingly advanced dialogue system, using evaluation data for system improvement. The designed dialogue system relies on a Question-Answering (QA) approach, but shows truly interactive gaming behaviour, e.g., by providing feedback, managing turns and contact, producing social signals and acts, e.g., encouraging vs. downplaying, polite vs. rude, positive vs. negative attitude towards players or their actions, etc. The DBOX dialogue corpus has required substantial investment. We expect it to have a great impact on the rest of the project. The DBOX project consortium will continue to maintain the corpus and to take an interest in its growth, e.g., expand to other languages. The resulting corpus will be publicly released.

2013

pdf bib
Combining Generative and Discriminative Model Scores for Distant Supervision
Benjamin Roth | Dietrich Klakow
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Towards Contextual Healthiness Classification of Food Items - A Linguistic Approach
Michael Wiegand | Dietrich Klakow
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf bib
Towards the Detection of Reliable Food-Health Relationships
Michael Wiegand | Dietrich Klakow
Proceedings of the Workshop on Language Analysis in Social Media

pdf bib
Predicative Adjectives: An Unsupervised Criterion to Extract Subjective Adjectives
Michael Wiegand | Josef Ruppenhofer | Dietrich Klakow
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

pdf bib
A Gold Standard for Relation Extraction in the Food Domain
Michael Wiegand | Benjamin Roth | Eva Lasarcyk | Stephanie Köser | Dietrich Klakow
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a gold standard for semantic relation extraction in the food domain for German. The relation types that we address are motivated by scenarios for which IT applications present a commercial potential, such as virtual customer advice in which a virtual agent assists a customer in a supermarket in finding those products that satisfy their needs best. Moreover, we focus on those relation types that can be extracted from natural language text corpora, ideally content from the internet, such as web forums, that are easy to retrieve. A typical relation type that meets these requirements is pairs of food items that are usually consumed together. Such a relation type could be used by a virtual agent to suggest additional products available in a shop that would potentially complement the items a customer already has in their shopping cart. Our gold standard comprises structural data, i.e. relation tables, which encode relation instances. These tables are vital in order to evaluate natural language processing systems that extract those relations.

pdf bib
Task-Driven Linguistic Analysis based on an Underspecified Features Representation
Stasinos Konstantopoulos | Valia Kordoni | Nicola Cancedda | Vangelis Karkaletsis | Dietrich Klakow | Jean-Michel Renders
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper we explore a task-driven approach to interfacing NLP components, where language processing is guided by the end-task that each application requires. The core idea is to generalize feature values into feature value distributions, representing under-specified feature values, and to fit linguistic pipelines with a back-channel of specification requests through which subsequent components can declare to preceding ones the importance of narrowing the value distribution of particular features that are critical for the current task.

pdf bib
Generalization Methods for In-Domain and Cross-Domain Opinion Holder Extraction
Michael Wiegand | Dietrich Klakow
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

pdf bib
Prototypical Opinion Holders: What We can Learn from Experts and Analysts
Michael Wiegand | Dietrich Klakow
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

pdf bib
The Role of Predicates in Opinion Holder Extraction
Michael Wiegand | Dietrich Klakow
Proceedings of the RANLP 2011 Workshop on Information Extraction and Knowledge Acquisition

pdf bib
Convolution Kernels for Subjectivity Detection
Michael Wiegand | Dietrich Klakow
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011)

2010

pdf bib
A Comparative Study of Word Co-occurrence for Term Clustering in Language Model-based Sentence Retrieval
Saeedeh Momtazi | Sanjeev Khudanpur | Dietrich Klakow
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Convolution Kernels for Opinion Holder Extraction
Michael Wiegand | Dietrich Klakow
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
A survey on the role of negation in sentiment analysis
Michael Wiegand | Alexandra Balahur | Benjamin Roth | Dietrich Klakow | Andrés Montoyo
Proceedings of the Workshop on Negation and Speculation in Natural Language Processing

pdf bib
Paragraph Acquisition and Selection for List Question Using Amazon’s Mechanical Turk
Fang Xu | Dietrich Klakow
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Creating more fine-grained annotated data than previously relevant document sets is important for evaluating individual components in automatic question answering systems. In this paper, we describe using Amazon's Mechanical Turk (AMT) to judge whether paragraphs in relevant documents answer corresponding list questions in TREC QA track 2004. Based on AMT results, we build a collection of 1300 gold-standard supporting paragraphs for list questions. Our online experiments suggested that recruiting more people per task assures better annotation quality. In order to learn true labels from AMT annotations, we investigated three approaches on two datasets with different levels of annotation errors. Experimental studies show that the Naive Bayesian model and EM-based GLAD model can generate results highly agreeing with gold-standard annotations, and significantly outperform the majority voting method for true label learning. We also suggested setting a higher HIT approval rate to assure better online annotation quality, which leads to better performance of learning methods.

pdf bib
Predictive Features for Detecting Indefinite Polar Sentences
Michael Wiegand | Dietrich Klakow
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In recent years, text classification in sentiment analysis has mostly focused on two types of classification, the distinction between objective and subjective text, i.e. subjectivity detection, and the distinction between positive and negative subjective text, i.e. polarity classification. So far, there has been little work examining the distinction between definite polar subjectivity and indefinite polar subjectivity. While the former are utterances which can be categorized as either positive or negative, the latter cannot be categorized as either of these two categories. This paper presents a small set of domain independent features to detect indefinite polar sentences. The features reflect the linguistic structure underlying these types of utterances. We give evidence for the effectiveness of these features by incorporating them into an unsupervised rule-based classifier for sentence-level analysis and compare its performance with supervised machine learning classifiers, i.e. Support Vector Machines (SVMs) and Nearest Neighbor Classifier (kNN). The data used for the experiments are web-reviews collected from three different domains.

pdf bib
A Named Entity Labeler for German: Exploiting Wikipedia and Distributional Clusters
Grzegorz Chrupała | Dietrich Klakow
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Named Entity Recognition is a relatively well-understood NLP task, with many publicly available training resources and software for processing English data. Other languages tend to be underserved in this area. For German, the CoNLL-2003 Shared Task provided training data, but there are no publicly available, ready-to-use tools. We fill this gap and develop a German NER system with state-of-the-art performance. In addition to CoNLL 2003 labeled training data, we use two additional resources: (i) 32 million words of unlabeled news article text and (ii) infobox labels from German Wikipedia articles. From the unlabeled text we derive distributional word clusters. Then we use cluster membership features and Wikipedia infobox label features to train a supervised model on the labeled training data. This approach allows us to deal better with word-types unseen in the training data and achieve good performance on German with little engineering effort.

2009

pdf bib
Predictive Features in Semi-Supervised Learning for Polarity Classification and the Role of Adjectives
Michael Wiegand | Dietrich Klakow
Proceedings of the 17th Nordic Conference of Computational Linguistics (NODALIDA 2009)

2008

pdf bib
Cost-Sensitive Learning in Answer Extraction
Michael Wiegand | Jochen L. Leidner | Dietrich Klakow
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

One problem of data-driven answer extraction in open-domain factoid question answering is that the class distribution of labeled training data is fairly imbalanced. In an ordinary training set, there are far more incorrect answers than correct answers. The class-imbalance is, thus, inherent to the classification task. It has a deteriorating effect on the performance of classifiers trained by standard machine learning algorithms. They usually have a heavy bias towards the majority class, i.e. the class which occurs most often in the training set. In this paper, we propose a method to tackle class imbalance by applying some form of cost-sensitive learning which is preferable to sampling. We present a simple but effective way of estimating the misclassification costs on the basis of class distribution. This approach offers three benefits. Firstly, it maintains the distribution of the classes of the labeled training data. Secondly, this form of meta-learning can be applied to a wide range of common learning algorithms. Thirdly, this approach can be easily implemented with the help of state-of-the-art machine learning software.

2006

pdf bib
Exploring Correlation of Dependency Relation Paths for Answer Extraction
Dan Shen | Dietrich Klakow
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Building an Evaluation Corpus for German Question Answering by Harvesting Wikipedia
Irene Cramer | Jochen L. Leidner | Dietrich Klakow
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The growing interest in open-domain question answering is limited by the lack of evaluation and training resources. To overcome this resource bottleneck for German, we propose a novel methodology to acquire new question-answer pairs for system evaluation that relies on volunteer collaboration over the Internet. Utilizing Wikipedia, a popular free online encyclopedia available in several languages, we show that the data acquisition problem can be cast as a Web experiment. We present a Web-based annotation tool and carry out a distributed data collection experiment. The data gathered from the mostly anonymous contributors is compared to a similar dataset produced in-house by domain experts on the one hand, and the German questions from the CLEF QA 2004 effort on the other hand. Our analysis of the datasets suggests that using our novel method a medium-scale evaluation resource can be built at very small cost in a short period of time. The technique and software developed here are readily applicable to other languages where free online encyclopedias are available, and our resulting corpus is likewise freely available.

2005

pdf bib
Exploring Syntactic Relation Patterns for Question Answering
Dan Shen | Geert-Jan M. Kruijff | Dietrich Klakow
Second International Joint Conference on Natural Language Processing: Full Papers

pdf bib
Studying Feature Generation from Various Data Representations for Answer Extraction
Dan Shen | Geert-Jan M. Kruijff | Dietrich Klakow
Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing
