Poster in Workshop: ICML 2021 Workshop on Unsupervised Reinforcement Learning
Decoupling Exploration and Exploitation in Reinforcement Learning
Lukas Schäfer · Filippos Christianos · Josiah Hanna · Stefano V. Albrecht
Abstract:
Intrinsic rewards are commonly applied to improve exploration in reinforcement learning. However, these approaches suffer from non-stationary reward shaping and strong dependency on hyperparameters. In this work, we propose Decoupled RL (DeRL), which trains separate policies for exploration and exploitation. DeRL can be applied with on-policy and off-policy RL algorithms. We evaluate DeRL algorithms in two exploration-focused environments with five types of intrinsic rewards. We show that DeRL can be more robust to the scaling of intrinsic rewards and converges to the same evaluation returns as intrinsically motivated baselines in fewer interactions.
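To illustrate the decoupling idea described in the abstract, below is a minimal sketch (not the authors' implementation): two tabular Q-learning learners share the transitions collected by an exploration policy trained with a count-based intrinsic bonus, while a separate exploitation policy is trained on extrinsic reward only. The chain environment, count-based bonus, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Sketch of decoupled exploration/exploitation with two tabular Q-learners.
# Environment, intrinsic reward, and hyperparameters are assumptions.

N_STATES, N_ACTIONS = 20, 2          # chain of states; actions: left / right
GAMMA, ALPHA, EPISODES, HORIZON = 0.95, 0.1, 500, 40

def step(s, a):
    """Chain MDP: extrinsic reward only at the far-right state."""
    s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    r_ext = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, r_ext

q_explore = np.zeros((N_STATES, N_ACTIONS))   # trained with intrinsic bonus
q_exploit = np.zeros((N_STATES, N_ACTIONS))   # trained on extrinsic reward only
visit_counts = np.ones(N_STATES)              # for count-based intrinsic reward

rng = np.random.default_rng(0)
for _ in range(EPISODES):
    s = 0
    for _ in range(HORIZON):
        # Behaviour comes from the exploration policy (epsilon-greedy).
        a = rng.integers(N_ACTIONS) if rng.random() < 0.1 else int(q_explore[s].argmax())
        s_next, r_ext = step(s, a)

        # Count-based intrinsic reward (one of several possible choices).
        visit_counts[s_next] += 1
        r_int = 1.0 / np.sqrt(visit_counts[s_next])

        # Exploration learner: extrinsic plus intrinsic reward.
        td_exp = r_ext + r_int + GAMMA * q_explore[s_next].max() - q_explore[s, a]
        q_explore[s, a] += ALPHA * td_exp

        # Exploitation learner: extrinsic reward only, trained off-policy from
        # the same transitions, so its value estimates are not distorted by the
        # non-stationary intrinsic bonus.
        td_task = r_ext + GAMMA * q_exploit[s_next].max() - q_exploit[s, a]
        q_exploit[s, a] += ALPHA * td_task

        s = s_next

# At evaluation time, act greedily with respect to the exploitation policy.
print(q_exploit.argmax(axis=1))
```

In this sketch the exploitation learner never acts in the environment during training; it only learns from the explorer's data, which mirrors the separation of exploration and exploitation policies that the abstract describes.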