Curriculum Learning Based on Reward Sparseness for Deep Reinforcement Learning of Task Completion Dialogue Management

Atsushi Saito


Abstract
Learning from sparse and delayed reward is a central issue in reinforcement learning. In this paper, to tackle reward sparseness problem of task oriented dialogue management, we propose a curriculum based approach on the number of slots of user goals. This curriculum makes it possible to learn dialogue management for sets of user goals with large number of slots. We also propose a dialogue policy based on progressive neural networks whose modules with parameters are appended with previous parameters fixed as the curriculum proceeds, and this policy improves performances over the one with single set of parameters.
Anthology ID:
W18-5707
Volume:
Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI
Month:
October
Year:
2018
Address:
Brussels, Belgium
Venues:
EMNLP | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
46–51
Language:
URL:
https://www.aclweb.org/anthology/W18-5707
DOI:
10.18653/v1/W18-5707
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/W18-5707.pdf