Kordula De Kuthy


Towards automatically generating Questions under Discussion to link information and discourse structure
Kordula De Kuthy | Madeeswaran Kannan | Haemanth Santhi Ponnusamy | Detmar Meurers
Proceedings of the 28th International Conference on Computational Linguistics

Questions under Discussion (QUD; Roberts, 2012) are emerging as a conceptually fruitful approach to spelling out the connection between the information structure of a sentence and the nature of the discourse in which the sentence can function. To make this approach useful for analyzing authentic data, Riester, Brunetti & De Kuthy (2018) presented a discourse annotation framework based on explicit pragmatic principles for determining a QUD for every assertion in a text. De Kuthy et al. (2018) demonstrate that this supports more reliable discourse structure annotation, and Ziai and Meurers (2018) show that based on explicit questions, automatic focus annotation becomes feasible. But both approaches are based on manually specified questions. In this paper, we present an automatic question generation approach to partially automate QUD annotation by generating all potentially relevant questions for a given sentence. While transformation rules can concisely capture the typical question formation process, a rule-based approach is not sufficiently robust for authentic data. We therefore employ the transformation rules to generate a large set of sentence-question-answer triples and train a neural question generation model on them to obtain both systematic question type coverage and robustness.


Annotating Information Structure in Italian: Characteristics and Cross-Linguistic Applicability of a QUD-Based Approach
Kordula De Kuthy | Lisa Brunetti | Marta Berardi
Proceedings of the 13th Linguistic Annotation Workshop

We present a discourse annotation study in which an annotation method based on Questions under Discussion (QUD) is applied to Italian data. The results of our inter-annotator agreement analysis show that the QUD-based approach, originally spelled out for English and German, can successfully be transferred cross-linguistically, supporting good agreement for the annotation of central information structure notions such as focus and non-at-issueness. Our annotation and inter-annotator agreement study on authentic Italian data confirms the cross-linguistic applicability of the QUD-based approach.

The Impact of Spelling Correction and Task Context on Short Answer Assessment for Intelligent Tutoring Systems
Ramon Ziai | Florian Nuxoll | Kordula De Kuthy | Björn Rudzewitz | Detmar Meurers
Proceedings of the 8th Workshop on NLP for Computer Assisted Language Learning


QUD-Based Annotation of Discourse Structure and Information Structure: Tool and Evaluation
Kordula De Kuthy | Nils Reiter | Arndt Riester
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Generating Feedback for English Foreign Language Exercises
Björn Rudzewitz | Ramon Ziai | Kordula De Kuthy | Verena Möller | Florian Nuxoll | Detmar Meurers
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

While immediate feedback on learner language is often discussed in the Second Language Acquisition literature (e.g., Mackey 2006), few systems used in real-life educational settings provide helpful, metalinguistic feedback to learners. In this paper, we present a novel approach leveraging task information to generate the expected range of well-formed and ill-formed variability in learner answers along with the required diagnosis and feedback. We combine this offline generation approach with an online component that matches the actual student answers against the pre-computed hypotheses. The results obtained for a set of 33 thousand answers of 7th grade German high school students learning English show that the approach successfully covers frequent answer patterns. At the same time, paraphrases and content errors require a more flexible alignment approach, for which we are planning to complement the method with the CoMiC approach successfully used for the analysis of reading comprehension answers (Meurers et al., 2011).

Feedback Strategies for Form and Meaning in a Real-life Language Tutoring System
Ramon Ziai | Bjoern Rudzewitz | Kordula De Kuthy | Florian Nuxoll | Detmar Meurers
Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning


Developing a web-based workbook for English supporting the interaction of students and teachers
Björn Rudzewitz | Ramon Ziai | Kordula De Kuthy | Detmar Meurers
Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition


Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus
Kordula De Kuthy | Ramon Ziai | Detmar Meurers
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

While the formal pragmatic concepts in information structure, such as the focus of an utterance, are precisely defined in theoretical linguistics and potentially very useful in conceptual and practical terms, it has turned out to be difficult to reliably annotate such notions in corpus data. We present a large-scale focus annotation effort designed to overcome this problem. Our annotation study is based on the task-based corpus CREG, which consists of answers to explicitly given reading comprehension questions. We compare focus annotation by trained annotators with a crowd-sourcing setup making use of untrained native speakers. Given the task context and an annotation process incrementally making the question form and answer type explicit, the trained annotators reach substantial agreement for focus annotation. Interestingly, the crowd-sourcing setup also supports high-quality annotation, though only for specific subtypes of data. Finally, we turn to the question of whether the relevance of focus annotation can be extrinsically evaluated. We show that automatic short-answer assessment significantly improves for focus-annotated data. The focus-annotated CREG corpus is freely available and constitutes the largest such resource for German.

Focus Annotation of Task-based Data: Establishing the Quality of Crowd Annotation
Kordula De Kuthy | Ramon Ziai | Detmar Meurers
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)

Approximating Givenness in Content Assessment through Distributional Semantics
Ramon Ziai | Kordula De Kuthy | Detmar Meurers
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics