Claudia Hauff


2020

Slice-Aware Neural Ranking
Gustavo Penha | Claudia Hauff
Proceedings of the 5th International Workshop on Search-Oriented Conversational AI (SCAI)

Understanding when and why neural ranking models fail for an IR task via error analysis is an important part of the research cycle. Here we focus on the challenges of (i) identifying categories of difficult instances (a pair of question and response candidates) for which a neural ranker is ineffective and (ii) improving neural ranking for such instances. To address both challenges we resort to slice-based learning, for which the goal is to improve the effectiveness of neural models for slices (subsets) of data. We address challenge (i) by proposing different slicing functions (SFs) that select slices of the dataset; based on prior work, we heuristically capture different failures of neural rankers. Then, for challenge (ii), we adapt a neural ranking model to learn slice-aware representations, i.e., the adapted model learns to represent the question and responses differently based on the model’s prediction of which slices they belong to. Our experimental results (the source code and data are available at https://github.com/Guzpenha/slice_based_learning) across three different ranking tasks and four corpora show that slice-based learning improves effectiveness by an average of 2% over a neural ranker that is not slice-aware.

2018

Feature Engineering for Second Language Acquisition Modeling
Guanliang Chen | Claudia Hauff | Geert-Jan Houben
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

Knowledge tracing serves as a keystone in delivering personalized education. However, few works have attempted to model students’ knowledge state in the setting of Second Language Acquisition. The Duolingo Shared Task on Second Language Acquisition Modeling provides students’ trace data, which we extensively analyze and from which we engineer features for the task of predicting whether a student will correctly solve a vocabulary exercise. Our analyses of students’ learning traces reveal that factors like exercise format and engagement impact their exercise performance to a large extent. Overall, we extracted 23 different features as input to a Gradient Tree Boosting framework, which resulted in an AUC score between 0.80 and 0.82 on the official test set.

2015

#SupportTheCause: Identifying Motivations to Participate in Online Health Campaigns
Dong Nguyen | Tijs van den Broek | Claudia Hauff | Djoerd Hiemstra | Michel Ehrenhard
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

On the Impact of Twitter-based Health Campaigns: A Cross-Country Analysis of Movember
Nugroho Dwi Prasetyo | Claudia Hauff | Dong Nguyen | Tijs van den Broek | Djoerd Hiemstra
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis