Curriculum Learning Based on Reward Sparseness for Deep Reinforcement Learning of Task Completion Dialogue Management
Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI
Learning from sparse and delayed reward is a central issue in reinforcement learning. In this paper, to tackle reward sparseness problem of task oriented dialogue management, we propose a curriculum based approach on the number of slots of user goals. This curriculum makes it possible to learn dialogue management for sets of user goals with large number of slots. We also propose a dialogue policy based on progressive neural networks whose modules with parameters are appended with previous parameters fixed as the curriculum proceeds, and this policy improves performances over the one with single set of parameters.