David Schlangen


2020

From “Before” to “After”: Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain
Robin Rojowiec | Jana Götze | Philipp Sadler | Henrik Voigt | Sina Zarrieß | David Schlangen
Proceedings of the 13th International Conference on Natural Language Generation

While certain types of instructions can be compactly expressed via images, there are situations where one might want to verbalise them, for example when directing someone. We investigate the task of Instruction Generation from Before/After Image Pairs which is to derive from images an instruction for effecting the implied change. For this, we make use of prior work on instruction following in a visual environment. We take an existing dataset, the BLOCKS data collected by Bisk et al. (2016) and investigate whether it is suitable for training an instruction generator as well. We find that it is, and investigate several simple baselines, taking these from the related task of image captioning. Through a series of experiments that simplify the task (by making image processing easier or completely side-stepping it; and by creating template-based targeted instructions), we investigate areas for improvement. We find that captioning models get some way towards solving the task, but have some difficulty with it, and future improvements must lie in the way the change is detected in the instruction.

A Corpus of Controlled Opinionated and Knowledgeable Movie Discussions for Training Neural Conversation Models
Fabian Galetzka | Chukwuemeka Uchenna Eneh | David Schlangen
Proceedings of the 12th Language Resources and Evaluation Conference

Fully data driven Chatbots for non-goal oriented dialogues are known to suffer from inconsistent behaviour across their turns, stemming from a general difficulty in controlling parameters like their assumed background personality and knowledge of facts. One reason for this is the relative lack of labeled data from which personality consistency and fact usage could be learned together with dialogue behaviour. To address this, we introduce a new labeled dialogue dataset in the domain of movie discussions, where every dialogue is based on pre-specified facts and opinions. We thoroughly validate the collected dialogue for adherence of the participants to their given fact and opinion profile, and find that the general quality in this respect is high. This process also gives us an additional layer of annotation that is potentially useful for training models. We introduce as a baseline an end-to-end trained self-attention decoder model trained on this data and show that it is able to generate opinionated responses that are judged to be natural and knowledgeable and show attentiveness.

Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU
Brielen Madureira | David Schlangen
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

While humans process language incrementally, the best language encoders currently used in NLP do not. Both bidirectional LSTMs and Transformers assume that the sequence that is to be encoded is available in full, to be processed either forwards and backwards (BiLSTMs) or as a whole (Transformers). We investigate how they behave under incremental interfaces, when partial output must be provided based on partial input seen up to a certain time step, which may happen in interactive systems. We test five models on various NLU datasets and compare their performance using three incremental evaluation metrics. The results support the possibility of using bidirectional encoders in incremental mode while retaining most of their non-incremental quality. The “omni-directional” BERT model, which achieves better non-incremental performance, is impacted more by the incremental access. This can be alleviated by adapting the training regime (truncated training), or the testing procedure, by delaying the output until some right context is available or by incorporating hypothetical right contexts generated by a language model like GPT-2.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Qun Liu | David Schlangen
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

2019

Know What You Don’t Know: Modeling a Pragmatic Speaker that Refers to Objects of Unknown Categories
Sina Zarrieß | David Schlangen
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Zero-shot learning in Language & Vision is the task of correctly labelling (or naming) objects of novel categories. Another strand of work in L&V aims at pragmatically informative rather than “correct” object descriptions, e.g. in reference games. We combine these lines of research and model zero-shot reference games, where a speaker needs to successfully refer to a novel object in an image. Inspired by models of “rational speech acts”, we extend a neural generator to become a pragmatic speaker reasoning about uncertain object categories. As a result of this reasoning, the generator produces fewer nouns and names of distractor categories as compared to a literal speaker. We show that this conversational strategy for dealing with novel objects often improves communicative success, in terms of resolution accuracy of an automatic listener.

Natural Language Semantics With Pictures: Some Language & Vision Datasets and Potential Uses for Computational Semantics
David Schlangen
Proceedings of the 13th International Conference on Computational Semantics - Long Papers

Propelling, and propelled by, the “deep learning revolution”, recent years have seen the introduction of ever larger corpora of images annotated with natural language expressions. We survey some of these corpora, taking a perspective that reverses the usual directionality, as it were, by viewing the images as semantic annotation of the natural language expressions. We discuss datasets that can be derived from the corpora, and tasks of potential interest for computational semanticists that can be defined on those. In this, we make use of relations provided by the corpora (namely, the link between expression and image, and that between two expressions linked to the same image) and relations that we can add (similarity relations between expressions, or between images). Specifically, we show that in this way we can create data that can be used to learn and evaluate lexical and compositional grounded semantics, and we show that the “linked to same image” relation tracks a semantic implication relation that is recognisable to annotators even in the absence of the linking image as evidence. Finally, as an example of possible benefits of this approach, we show that an exemplar-model-based approach to implication beats a (simple) distributional space-based one on some derived datasets, while lending itself to explainability.

From Explainability to Explanation: Using a Dialogue Setting to Elicit Annotations with Justifications
Nazia Attari | Martin Heckmann | David Schlangen
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Despite recent attempts in the field of explainable AI to go beyond black box prediction models, typically already the training data for supervised machine learning is collected in a manner that treats the annotator as a “black box”, the internal workings of which remain unobserved. We present an annotation method where a task is given to a pair of annotators who collaborate on finding the best response. With this we want to shed light on the questions if the collaboration increases the quality of the responses and if this “thinking together” provides useful information in itself, as it at least partially reveals their reasoning steps. Furthermore, we expect that this setting puts the focus on explanation as a linguistic act, vs. explainability as a property of models. In a crowd-sourcing experiment, we investigated three different annotation tasks, each in a collaborative dialogical (two annotators) and monological (one annotator) setting. Our results indicate that our experiment elicits collaboration and that this collaboration increases the response accuracy. We see large differences in the annotators’ behavior depending on the task. Similarly, we also observe that the dialog patterns emerging from the collaboration vary significantly with the task.

Tell Me More: A Dataset of Visual Scene Description Sequences
Nikolai Ilinykh | Sina Zarrieß | David Schlangen
Proceedings of the 12th International Conference on Natural Language Generation

We present a dataset consisting of what we call image description sequences, which are multi-sentence descriptions of the contents of an image. These descriptions were collected in a pseudo-interactive setting, where the describer was told to describe the given image to a listener who needs to identify the image within a set of images, and who successively asks for more information. As we show, this setup produced nicely structured data that, we think, will be useful for learning models capable of planning and realising such description discourses.

Can Neural Image Captioning be Controlled via Forced Attention?
Philipp Sadler | Tatjana Scheffler | David Schlangen
Proceedings of the 12th International Conference on Natural Language Generation

Learned dynamic weighting of the conditioning signal (attention) has been shown to improve neural language generation in a variety of settings. The weights applied when generating a particular output sequence have also been viewed as providing a potentially explanatory insight in the internal workings of the generator. In this paper, we reverse the direction of this connection and ask whether through the control of the attention of the model we can control its output. Specifically, we take a standard neural image captioning model that uses attention, and fix the attention to predetermined areas in the image. We evaluate whether the resulting output is more likely to mention the class of the object in that area than the normally generated caption. We introduce three effective methods to control the attention and find that these are producing expected results in up to 27.43% of the cases.

2018

A Corpus of Natural Multimodal Spatial Scene Descriptions
Ting Han | David Schlangen
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

The Task Matters: Comparing Image Captioning and Task-Based Dialogical Image Description
Nikolai Ilinykh | Sina Zarrieß | David Schlangen
Proceedings of the 11th International Conference on Natural Language Generation

Image captioning models are typically trained on data that is collected from people who are asked to describe an image, without being given any further task context. As we argue here, this context independence is likely to cause problems for transferring to task settings in which image description is bound by task demands. We demonstrate that careful design of data collection is required to obtain image descriptions which are contextually bounded to a particular meta-level task. As a task, we use MeetUp!, a text-based communication game where two players have the goal of finding each other in a visual environment. To reach this goal, the players need to describe images representing their current location. We analyse a dataset from this domain and show that the nature of image descriptions found in MeetUp! is diverse, dynamic and rich with phenomena that are not present in descriptions obtained through a simple image captioning task, which we ran for comparison.

Decoding Strategies for Neural Referring Expression Generation
Sina Zarrieß | David Schlangen
Proceedings of the 11th International Conference on Natural Language Generation

RNN-based sequence generation is now widely used in NLP and NLG (natural language generation). Most work focusses on how to train RNNs, even though decoding is also not necessarily straightforward: previous work on neural MT found seq2seq models to radically prefer short candidates, and has proposed a number of beam search heuristics to deal with this. In this work, we assess decoding strategies for referring expression generation with neural models. Here, expression length is crucial: output should contain neither too much nor too little information, in order to be pragmatically adequate. We find that most beam search heuristics developed for MT do not generalize well to referring expression generation (REG), and do not generally outperform greedy decoding. We observe that beam search heuristics for termination seem to override the model’s knowledge of what a good stopping point is. Therefore, we also explore a recent approach called trainable decoding, which uses a small network to modify the RNN’s hidden state for better decoding results. We find this approach to consistently outperform greedy decoding for REG.

Being data-driven is not enough: Revisiting interactive instruction giving as a challenge for NLG
Sina Zarrieß | David Schlangen
Proceedings of the Workshop on NLG for Human–Robot Interaction

Modeling traditional NLG tasks with data-driven techniques has been a major focus of research in NLG in the past decade. We argue that existing modeling techniques are mostly tailored to textual data and are not sufficient to make NLG technology meet the requirements of agents which target fluid interaction and collaboration in the real world. We revisit interactive instruction giving as a challenge for data-driven NLG and, based on insights from previous GIVE challenges, propose that instruction giving should be addressed in a setting that involves visual grounding and spoken language. These basic design decisions will require NLG frameworks that are capable of monitoring their environment as well as timing and revising their verbal output. We believe that these are core capabilities for making NLG technology transferrable to interactive systems.

2017

Obtaining referential word meanings from visual and distributional information: Experiments on object naming
Sina Zarrieß | David Schlangen
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We investigate object naming, which is an important sub-task of referring expression generation on real-world images. As opposed to mutually exclusive labels used in object recognition, object names are more flexible, subject to communicative preferences and semantically related to each other. Therefore, we investigate models of referential word meaning that link visual to lexical information which we assume to be given through distributional word embeddings. We present a model that learns individual predictors for object names that link visual and distributional aspects of word meaning during training. We show that this is particularly beneficial for zero-shot learning, as compared to projecting visual objects directly into the distributional space. In a standard object naming task, we find that different ways of combining lexical and visual information achieve very similar performance, though experiments on model combination suggest that they capture complementary aspects of referential meaning.

Joint, Incremental Disfluency Detection and Utterance Segmentation from Speech
Julian Hough | David Schlangen
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

We present the joint task of incremental disfluency detection and utterance segmentation and a simple deep learning system which performs it on transcripts and ASR results. We show how the constraints of the two tasks interact. Our joint-task system outperforms the equivalent individual task systems, provides competitive results and is suitable for future use in conversation agents in the psychiatric domain.

Is this a Child, a Girl or a Car? Exploring the Contribution of Distributional Similarity to Learning Referential Word Meanings
Sina Zarrieß | David Schlangen
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

There has recently been a lot of work trying to use images of referents of words for improving vector space meaning representations derived from text. We investigate the opposite direction, as it were, trying to improve visual word predictors that identify objects in images, by exploiting distributional similarity information during training. We show that for certain words (such as entry-level nouns or hypernyms), we can indeed learn better referential word meanings by taking into account their semantic similarity to other words. For other words, there is no or even a detrimental effect, compared to a learning setup that presents even semantically related objects as negative instances.

Grounding Language by Continuous Observation of Instruction Following
Ting Han | David Schlangen
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Grounded semantics is typically learnt from utterance-level meaning representations (e.g., successful database retrievals, denoted objects in images, moves in a game). We explore learning word and utterance meanings by continuous observation of the actions of an instruction follower (IF). While an instruction giver (IG) provided a verbal description of a configuration of objects, IF recreated it using a GUI. Aligning these GUI actions to sub-utterance chunks allows a simple maximum entropy model to associate them as chunk meaning better than just providing it with the utterance-final configuration. This shows that semantics useful for incremental (word-by-word) application, as required in natural dialogue, might also be better acquired from incremental settings.

Natural Language Informs the Interpretation of Iconic Gestures: A Computational Approach
Ting Han | Julian Hough | David Schlangen
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

When giving descriptions, speakers often signify object shape or size with hand gestures. Such so-called ‘iconic’ gestures represent their meaning through their relevance to referents in the verbal content, rather than having a conventional form. The gesture form on its own is often ambiguous, and the aspect of the referent that it highlights is constrained by what the language makes salient. We show how the verbal content guides gesture interpretation through a computational model that frames the task as a multi-label classification task that maps multimodal utterances to semantic categories, using annotated human-human data.

Draw and Tell: Multimodal Descriptions Outperform Verbal- or Sketch-Only Descriptions in an Image Retrieval Task
Ting Han | David Schlangen
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

While language conveys meaning largely symbolically, actual communication acts typically contain iconic elements as well: People gesture while they speak, or may even draw sketches while explaining something. Image retrieval prima facie seems like a task that could profit from combined symbolic and iconic reference, but it is typically set up to work either from language only, or via (iconic) sketches with no verbal contribution. Using a model of grounded language semantics and a model of sketch-to-image mapping, we show that adding even very reduced iconic information to a verbal image description improves recall. Verbal descriptions paired with fully detailed sketches still perform better than these sketches alone. We see these results as supporting the assumption that natural user interfaces should respond to multimodal input, where possible, rather than just language alone.

Refer-iTTS: A System for Referring in Spoken Installments to Objects in Real-World Images
Sina Zarrieß | M. Soledad López Gambino | David Schlangen
Proceedings of the 10th International Conference on Natural Language Generation

Current referring expression generation systems mostly deliver their output as one-shot, written expressions. We present on-going work on incremental generation of spoken expressions referring to objects in real-world images. This approach extends upon previous work using the words-as-classifier model for generation. We implement this generator in an incremental dialogue processing framework such that we can exploit an existing interface to incremental text-to-speech synthesis. Our system generates and synthesizes referring expressions while continuously observing non-verbal user reactions.

Beyond On-hold Messages: Conversational Time-buying in Task-oriented Dialogue
Soledad López Gambino | Sina Zarrieß | David Schlangen
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue

A common convention in graphical user interfaces is to indicate a “wait state”, for example while a program is preparing a response, through a changed cursor state or a progress bar. What should the analogue be in a spoken conversational system? To address this question, we set up an experiment in which a human information provider (IP) was given their information only in a delayed and incremental manner, which systematically created situations where the IP had the turn but could not provide task-related information. Our data analysis shows that 1) IPs bridge the gap until they can provide information by re-purposing a whole variety of task- and grounding-related communicative actions (e.g. echoing the user’s request, signaling understanding, asserting partially relevant information), rather than being silent or explicitly asking for time (e.g. “please wait”), and that 2) IPs combined these actions productively to ensure an ongoing conversation. These results, we argue, indicate that natural conversational interfaces should also be able to manage their time flexibly using a variety of conversational resources.

Deriving continous grounded meaning representations from referentially structured multimodal contexts
Sina Zarrieß | David Schlangen
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Corpora of referring expressions paired with their visual referents are a good source for learning word meanings directly grounded in visual representations. Here, we explore additional ways of extracting from them word representations linked to multi-modal context: through expressions that refer to the same object, and through expressions that refer to different objects in the same scene. We show that continuous meaning representations derived from these contexts capture complementary aspects of similarity, even if not outperforming textual embeddings trained on very large amounts of raw text when tested on standard similarity benchmarks. We propose a new task for evaluating grounded meaning representations—detection of potentially co-referential phrases—and show that it requires precise denotational representations of attribute meanings, which our method provides.

2016

PentoRef: A Corpus of Spoken References in Task-oriented Dialogues
Sina Zarrieß | Julian Hough | Casey Kennington | Ramesh Manuvinakurike | David DeVault | Raquel Fernández | David Schlangen
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

PentoRef is a corpus of task-oriented dialogues collected in systematically manipulated settings. The corpus is multilingual, with English and German sections, and overall comprises more than 20000 utterances. The dialogues are fully transcribed and annotated with referring expressions mapped to objects in corresponding visual scenes, which makes the corpus a rich resource for research on spoken referring expressions in generation and resolution. The corpus includes several sub-corpora that correspond to different dialogue situations where parameters related to interactivity, visual access, and verbal channel have been manipulated in systematic ways. The corpus thus lends itself to very targeted studies of reference in spontaneous dialogue.

DUEL: A Multi-lingual Multimodal Dialogue Corpus for Disfluency, Exclamations and Laughter
Julian Hough | Ye Tian | Laura de Ruiter | Simon Betz | Spyros Kousidis | David Schlangen | Jonathan Ginzburg
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present the DUEL corpus, consisting of 24 hours of natural, face-to-face, loosely task-directed dialogue in German, French and Mandarin Chinese. The corpus is uniquely positioned as a cross-linguistic, multimodal dialogue resource controlled for domain. DUEL includes audio, video and body tracking data and is transcribed and annotated for disfluency, laughter and exclamations.

How to Address Smart Homes with a Social Robot? A Multi-modal Corpus of User Interactions with an Intelligent Environment
Patrick Holthaus | Christian Leichsenring | Jasmin Bernotat | Viktor Richter | Marian Pohling | Birte Carlmeyer | Norman Köster | Sebastian Meyer zu Borgsen | René Zorn | Birte Schiffhauer | Kai Frederic Engelmann | Florian Lier | Simon Schulz | Philipp Cimiano | Friederike Eyssel | Thomas Hermann | Franz Kummert | David Schlangen | Sven Wachsmuth | Petra Wagner | Britta Wrede | Sebastian Wrede
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In order to explore intuitive verbal and non-verbal interfaces in smart environments we recorded user interactions with an intelligent apartment. Besides offering various interactive capabilities itself, the apartment is also inhabited by a social robot that is available as a humanoid interface. This paper presents a multi-modal corpus that contains goal-directed actions of naive users in attempts to solve a number of predefined tasks. Alongside audio and video recordings, our data-set consists of a large amount of temporally aligned sensory data and system behavior provided by the environment and its interactive components. Non-verbal system responses such as changes in light or display contents, as well as robot and apartment utterances and gestures serve as a rich basis for later in-depth analysis. Manual annotations provide further information about meta data like the current course of study and user behavior including the incorporated modality, all literal utterances, language features, emotional expressions, foci of attention, and addressees.

Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs
Sina Zarrieß | David Schlangen
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Resolving References to Objects in Photographs using the Words-As-Classifiers Model
David Schlangen | Sina Zarrieß | Casey Kennington
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Real-Time Understanding of Complex Discriminative Scene Descriptions
Ramesh Manuvinakurike | Casey Kennington | David DeVault | David Schlangen
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Supporting Spoken Assistant Systems with a Graphical User Interface that Signals Incremental Understanding and Prediction State
Casey Kennington | David Schlangen
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Toward incremental dialogue act segmentation in fast-paced interactive dialogue systems
Ramesh Manuvinakurike | Maike Paetzel | Cheng Qu | David Schlangen | David DeVault
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Investigating Fluidity for Human-Robot Interaction with Real-time, Real-world Grounding Strategies
Julian Hough | David Schlangen
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Towards Generating Colour Terms for Referents in Photographs: Prefer the Expected or the Unexpected?
Sina Zarrieß | David Schlangen
Proceedings of the 9th International Natural Language Generation conference

2015

A Discriminative Model for Perceptually-Grounded Incremental Reference Resolution
Casey Kennington | Livia Dia | David Schlangen
Proceedings of the 11th International Conference on Computational Semantics

Incremental Semantics for Dialogue Processing: Requirements, and a Comparison of Two Approaches
Julian Hough | Casey Kennington | David Schlangen | Jonathan Ginzburg
Proceedings of the 11th International Conference on Computational Semantics

Reading Times Predict the Quality of Generated Text Above and Beyond Human Ratings
Sina Zarrieß | Sebastian Loth | David Schlangen
Proceedings of the 15th European Workshop on Natural Language Generation (ENLG)

Incrementally Tracking Reference in Human/Human Dialogue Using Linguistic and Extra-Linguistic Information
Casey Kennington | Ryu Iida | Takenobu Tokunaga | David Schlangen
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Simple Learning and Compositional Application of Perceptually Grounded Word Meanings for Incremental Reference Resolution
Casey Kennington | David Schlangen
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Situationally Aware In-Car Information Presentation Using Incremental Speech Generation: Safer, and More Effective
Spyros Kousidis | Casey Kennington | Timo Baumann | Hendrik Buschmeier | Stefan Kopp | David Schlangen
Proceedings of the EACL 2014 Workshop on Dialogue in Motion

InproTKs: A Toolkit for Incremental Situated Processing
Casey Kennington | Spyros Kousidis | David Schlangen
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

Situated Incremental Natural Language Understanding using a Multimodal, Linguistically-driven Update Model
Casey Kennington | Spyros Kousidis | David Schlangen
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Interpreting Situated Dialogue Utterances: an Update Model that Uses Speech, Gaze, and Gesture Information
Casey Kennington | Spyros Kousidis | David Schlangen
Proceedings of the SIGDIAL 2013 Conference

Open-ended, Extensible System Utterances Are Preferred, Even If They Require Filled Pauses
Timo Baumann | David Schlangen
Proceedings of the SIGDIAL 2013 Conference

Investigating speaker gaze and pointing behaviour in human-computer interaction with the mint.tools collection
Spyros Kousidis | Casey Kennington | David Schlangen
Proceedings of the SIGDIAL 2013 Conference

2012

INPRO_iSS: A Component for Just-In-Time Incremental Speech Synthesis
Timo Baumann | David Schlangen
Proceedings of the ACL 2012 System Demonstrations

Combining Incremental Language Generation and Incremental Speech Synthesis for Adaptive Information Presentation
Hendrik Buschmeier | Timo Baumann | Benjamin Dosch | Stefan Kopp | David Schlangen
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Markov Logic Networks for Situated Incremental Natural Language Understanding
Casey Kennington | David Schlangen
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
The Future of Spoken Dialogue Systems is in their Past: Long-Term Adaptive, Conversational Assistants
David Schlangen
NAACL-HLT Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data (SDCTD 2012)

pdf bib
The InproTK 2012 release
Timo Baumann | David Schlangen
NAACL-HLT Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data (SDCTD 2012)

pdf bib
Incremental Construction of Robust but Deep Semantic Representations for Use in Responsive Dialogue Systems
Andreas Peldszus | David Schlangen
Proceedings of the Workshop on Advances in Discourse Analysis and its Computational Aspects

pdf bib
Joint Satisfaction of Syntactic and Pragmatic Constraints Improves Incremental Spoken Language Understanding
Andreas Peldszus | Okko Buß | Timo Baumann | David Schlangen
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

pdf bib
Predicting the Micro-Timing of User Input for an Incremental Spoken Dialogue System that Completes a User’s Ongoing Turn
Timo Baumann | David Schlangen
Proceedings of the SIGDIAL 2011 Conference

2010

pdf bib
Comparing Local and Sequential Models for Statistical Incremental Natural Language Understanding
Silvan Heintze | Timo Baumann | David Schlangen
Proceedings of the SIGDIAL 2010 Conference

pdf bib
Middleware for Incremental Processing in Conversational Agents
David Schlangen | Timo Baumann | Hendrik Buschmeier | Okko Buß | Stefan Kopp | Gabriel Skantze | Ramin Yaghoubzadeh
Proceedings of the SIGDIAL 2010 Conference

pdf bib
Collaborating on Utterances with a Spoken Dialogue System Using an ISU-based Approach to Incremental Dialogue Management
Okko Buß | Timo Baumann | David Schlangen
Proceedings of the SIGDIAL 2010 Conference

2009

pdf bib
RUBISC - a Robust Unification-Based Incremental Semantic Chunker
Michaela Atterer | David Schlangen
Proceedings of SRSL 2009, the 2nd Workshop on Semantic Representation of Spoken Language

pdf bib
Incremental Reference Resolution: The Task, Metrics for Evaluation, and a Bayesian Filtering Model that is Sensitive to Disfluencies
David Schlangen | Timo Baumann | Michaela Atterer
Proceedings of the SIGDIAL 2009 Conference

pdf bib
TELIDA: A Package for Manipulation and Visualization of Timed Linguistic Data
Titus von der Malsburg | Timo Baumann | David Schlangen
Proceedings of the SIGDIAL 2009 Conference

pdf bib
A General, Abstract Model of Incremental Dialogue Processing
David Schlangen | Gabriel Skantze
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf bib
Incremental Dialogue Processing in a Micro-Domain
Gabriel Skantze | David Schlangen
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf bib
Assessing and Improving the Performance of Speech Recognition for Incremental Systems
Timo Baumann | Michaela Atterer | David Schlangen
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2008

pdf bib
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue
David Schlangen | Beth Ann Hockey
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue

pdf bib
A Simple Method for Resolution of Definite Reference in a Shared Visual Context
Alexander Siebert | David Schlangen
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue

pdf bib
Towards Incremental End-of-Utterance Detection in Dialogue Systems
Michaela Atterer | Timo Baumann | David Schlangen
Coling 2008: Companion volume: Posters

2007

pdf bib
An Implemented Method for Distributed Collection and Assessment of Speech Data
Alexander Siebert | David Schlangen | Raquel Fernández
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue

pdf bib
Beyond Repair – Testing the Limits of the Conversational Repair System
David Schlangen | Raquel Fernández
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue

pdf bib
Referring under Restricted Interactivity Conditions
Raquel Fernández | Tatjana Lucht | David Schlangen
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue

2005

pdf bib
Towards Finding and Fixing Fragments—Using ML to Identify Non-Sentential Utterances and their Antecedents in Multi-Party Dialogue
David Schlangen
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

pdf bib
Feeding OWL: Extracting and Representing the Content of Pathology Reports
David Schlangen | Manfred Stede | Elena Paslaru Bontas
Proceedings of the Workshop on NLP and XML (NLPXML-2004): RDF/RDFS and OWL in Language Technology

pdf bib
Causes and Strategies for Requesting Clarification in Dialogue
David Schlangen
Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue at HLT-NAACL 2004

2003

pdf bib
The interpretation of non-sentential utterances in dialogue
David Schlangen | Alex Lascarides
Proceedings of the Fourth SIGdial Workshop of Discourse and Dialogue