Maarten de Rijke

Also published as: Maarten De Rijke


2020

WN-Salience: A Corpus of News Articles with Entity Salience Annotations
Chuan Wu | Evangelos Kanoulas | Maarten de Rijke | Wei Lu
Proceedings of the 12th Language Resources and Evaluation Conference

Entities can be found in various text genres, ranging from tweets and web pages to user queries submitted to web search engines. Existing research either considers all entities in the text equally important or uses heuristics to measure their salience. We believe that a key reason for the relatively limited work on entity salience is the lack of appropriate datasets. To support research on entity salience, we present a new dataset, the WikiNews Salience dataset (WN-Salience), which can be used to benchmark tasks such as entity salience detection and salient entity linking. WN-Salience is built on top of Wikinews, a Wikimedia project whose mission is to present reliable news articles. Entities in Wikinews articles are identified by the authors of the articles and are linked to Wikinews categories when they are salient or to Wikipedia pages otherwise. The dataset is built automatically and consists of approximately 7,000 news articles and 90,000 in-text entity annotations. We compare the WN-Salience dataset against existing datasets on the task and analyze their differences. Furthermore, we conduct experiments on entity salience detection; the results demonstrate that WN-Salience is a challenging testbed that is complementary to existing ones.

Guided Dialogue Policy Learning without Adversarial Learning in the Loop
Ziming Li | Sungjin Lee | Baolin Peng | Jinchao Li | Julia Kiseleva | Maarten de Rijke | Shahin Shayandeh | Jianfeng Gao
Findings of the Association for Computational Linguistics: EMNLP 2020

Reinforcement learning methods have emerged as a popular choice for training an efficient and effective dialogue policy. However, these methods suffer from sparse and unstable reward signals returned by a user simulator only when a dialogue finishes. Besides, the reward signal is manually designed by human experts, which requires domain knowledge. Recently, a number of adversarial learning methods have been proposed to learn the reward function together with the dialogue policy. However, to alternately update the dialogue policy and the reward model on the fly, we are limited to policy-gradient-based algorithms, such as REINFORCE and PPO. Moreover, the alternating training of a dialogue agent and the reward model can easily get stuck in local optima or result in mode collapse. To overcome these issues, we propose to decompose the adversarial training into two steps. First, we train the discriminator with an auxiliary dialogue generator; then, we incorporate a derived reward model into a common reinforcement learning method to guide the dialogue policy learning. This approach is applicable to both on-policy and off-policy reinforcement learning methods. Based on our extensive experimentation, we conclude that the proposed method (1) achieves a remarkable task success rate using both on-policy and off-policy reinforcement learning methods, and (2) has the potential to transfer knowledge from existing domains to a new domain.

Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems
Ziming Li | Julia Kiseleva | Maarten de Rijke
Findings of the Association for Computational Linguistics: EMNLP 2020

Dialogue policy learning for task-oriented dialogue systems has enjoyed great progress recently, mostly through employing reinforcement learning methods. However, these approaches have become very sophisticated. It is time to re-evaluate them. Are we really making progress developing dialogue agents only based on reinforcement learning? We demonstrate how (1) traditional supervised learning together with (2) a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art reinforcement learning-based methods. First, we introduce a simple dialogue action decoder to predict the appropriate actions. Then, the traditional multi-label classification solution for dialogue policy learning is extended by adding dense layers to improve the dialogue agent performance. Finally, we employ the Gumbel-Softmax estimator to alternately train the dialogue agent and the dialogue reward model without using reinforcement learning. Based on our extensive experimentation, we conclude that the proposed methods can achieve more stable and higher performance with less effort, avoiding, for example, the domain knowledge required to design a user simulator and the intractable parameter tuning in reinforcement learning. Our main goal is not to beat RL with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.

2018

Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots
Shaojie Jiang | Maarten de Rijke
Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI

Diversity is a long-studied topic in information retrieval that usually refers to the requirement that retrieved results should be non-repetitive and cover different aspects. In a conversational setting, an additional dimension of diversity matters: an engaging response generation system should be able to output responses that are diverse and interesting. Sequence-to-sequence (Seq2Seq) models have been shown to be very effective for response generation. However, dialogue responses generated by Seq2Seq models tend to have low diversity. In this paper, we review known sources and existing approaches to this low-diversity problem. We also identify a source of low diversity that has been little studied so far, namely model over-confidence. We sketch several directions for tackling model over-confidence and, hence, the low-diversity problem, including confidence penalties and label smoothing.

2016

Siamese CBOW: Optimizing Word Embeddings for Sentence Representations
Tom Kenter | Alexey Borisov | Maarten de Rijke
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

Learning to Explain Entity Relationships in Knowledge Graphs
Nikos Voskarides | Edgar Meij | Manos Tsagkias | Maarten de Rijke | Wouter Weerkamp
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Prior-informed Distant Supervision for Temporal Evidence Classification
Ridho Reinanda | Maarten de Rijke
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2010

Mining User Experiences from Online Forums: An Exploration
Valentin Jijkoun | Wouter Weerkamp | Maarten de Rijke | Paul Ackermans | Gijs Geleijnse
Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics in a World of Social Media

Generating Focused Topic-Specific Sentiment Lexicons
Valentin Jijkoun | Maarten de Rijke | Wouter Weerkamp
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

A Generative Blog Post Retrieval Model that Uses Query Expansion based on External Collections
Wouter Weerkamp | Krisztian Balog | Maarten de Rijke
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

Credibility Improves Topical Blog Post Retrieval
Wouter Weerkamp | Maarten de Rijke
Proceedings of ACL-08: HLT

2007

A Cascaded Machine Learning Approach to Interpreting Temporal Expressions
David Ahn | Joris van Rantwijk | Maarten de Rijke
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Learning to Transform Linguistic Graphs
Valentin Jijkoun | Maarten de Rijke
Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing

UVA: Language Modeling Techniques for Web People Search
Krisztian Balog | Leif Azzopardi | Maarten de Rijke
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

The Multilingual Question Answering Track at CLEF
Bernardo Magnini | Danilo Giampiccolo | Lili Aunimo | Christelle Ayache | Petya Osenova | Anselmo Peñas | Maarten de Rijke | Bogdan Sacaleanu | Diana Santos | Richard Sutcliffe
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents an overview of the Multilingual Question Answering evaluation campaigns that have been organized at CLEF (Cross Language Evaluation Forum) since 2003. Over the years, the competition has registered a steady increase in the number of participants and languages involved. In fact, from the original eight groups that participated in the 2003 QA track, the number of competitors rose to twenty-four in 2005. Also, the performance of the systems has steadily improved, and the average of the best performances in 2005 saw an increase of 10% with respect to the previous year.

Representing and Querying Multi-dimensional Markup for Question Answering
Wouter Alink | Valentin Jijkoun | David Ahn | Maarten de Rijke | Peter Boncz | Arjen de Vries
Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing

Learning to Recognize Blogs: A Preliminary Exploration
Erik Elgersma | Maarten de Rijke
Proceedings of the Workshop on NEW TEXT Wikis and blogs and other dynamic text sources

Finding Similar Sentences across Multiple Languages in Wikipedia
Sisay Fissaha Adafre | Maarten de Rijke
Proceedings of the Workshop on NEW TEXT Wikis and blogs and other dynamic text sources

Why Are They Excited? Identifying and Explaining Spikes in Blog Mood Levels
Krisztian Balog | Gilad Mishne | Maarten de Rijke
Demonstrations

2005

Feature Engineering and Post-Processing for Temporal Expression Recognition Using Conditional Random Fields
Sisay Fissaha Adafre | Maarten de Rijke
Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing

2004

Enriching the Output of a Parser Using Memory-based Learning
Valentin Jijkoun | Maarten de Rijke
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

Alternative approaches for Generating Bodies of Grammar Rules
Gabriel Infante-Lopez | Maarten de Rijke
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

Information Extraction for Question Answering: Improving Recall Through Syntactic Patterns
Valentin Jijkoun | Jori Mur | Maarten de Rijke
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Comparing the Ambiguity Reduction Abilities of Probabilistic Context-Free Grammars
Gabriel Infante-Lopez | Maarten de Rijke
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Using WordNet to Measure Semantic Orientations of Adjectives
Jaap Kamps | Maarten Marx | Robert J. Mokken | Maarten de Rijke
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

BioGrapher: Biography Questions as a Restricted Domain Question Answering Task
Oren Tsur | Maarten de Rijke | Khalil Sima’an
Proceedings of the Conference on Question Answering in Restricted Domains

The University of Amsterdam at Senseval-3: Semantic roles and Logic forms
David Ahn | Sisay Fissaha | Valentin Jijkoun | Maarten De Rijke
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text