Oral
Discovering Options for Exploration by Minimizing Cover Time
Yuu Jinnai · Jee Won Park · David Abel · George Konidaris
Abstract:
One of the main challenges in reinforcement learning is solving tasks with sparse rewards. We first show that the difficulty of discovering the rewarding state is bounded by the expected cover time of the underlying random walk induced by a policy. We propose a method that automatically discovers options which reduce the cover time, speeding up exploration in sparse-reward domains. We show empirically that the proposed algorithm reduces the cover time and improves the performance of reinforcement learning agents.
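To make the notion of cover time concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It estimates by Monte Carlo simulation the expected number of steps a uniform random walk needs to visit every state of a small chain, then repeats the estimate after adding one extra edge between the two ends of the chain. The chain environment, the hand-picked shortcut edge standing in for an option, and the names `estimate_cover_time` and `path_graph` are all assumptions made for illustration.

```python
import random


def estimate_cover_time(adjacency, start, trials=500, seed=0):
    """Monte Carlo estimate of the expected cover time of a uniform
    random walk on a connected graph, starting from `start`."""
    rng = random.Random(seed)
    n = len(adjacency)
    total_steps = 0
    for _ in range(trials):
        state = start
        visited = {start}
        steps = 0
        # Walk until every state has been visited at least once.
        while len(visited) < n:
            state = rng.choice(adjacency[state])
            visited.add(state)
            steps += 1
        total_steps += steps
    return total_steps / trials


def path_graph(n):
    """Chain of n states where state i is adjacent to i - 1 and i + 1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


n = 20
chain = path_graph(n)

# Hypothetical stand-in for an option: one extra edge joining the two
# ends of the chain. The choice of edge is hand-picked for illustration.
with_shortcut = {s: list(nbrs) for s, nbrs in chain.items()}
with_shortcut[0].append(n - 1)
with_shortcut[n - 1].append(0)

baseline = estimate_cover_time(chain, start=0)
shortcut = estimate_cover_time(with_shortcut, start=0)
print(f"estimated cover time, chain only:    {baseline:.1f}")
print(f"estimated cover time, with shortcut: {shortcut:.1f}")
```

In this toy setting the extra edge turns the chain into a cycle, which a random walk covers in fewer steps on average, so the second estimate should come out lower than the first. How the proposed method actually selects options is not described in the abstract.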