

Poster in Workshop: ICML 2021 Workshop on Unsupervised Reinforcement Learning

Decoupling Exploration and Exploitation in Reinforcement Learning

Lukas Schäfer · Filippos Christianos · Josiah Hanna · Stefano V. Albrecht


Abstract:

Intrinsic rewards are commonly applied to improve exploration in reinforcement learning. However, these approaches suffer from non-stationary reward shaping and a strong dependency on hyperparameters. In this work, we propose Decoupled RL (DeRL), which trains separate policies for exploration and exploitation. DeRL can be applied with on-policy and off-policy RL algorithms. We evaluate DeRL algorithms in two exploration-focused environments with five types of intrinsic rewards. We show that DeRL can be more robust to the scaling of intrinsic rewards and converge to the same evaluation returns as intrinsically motivated baselines in fewer interactions.
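The following is a minimal sketch of the decoupling idea described in the abstract, not the authors' DeRL implementation: a separate exploration policy is trained on extrinsic plus intrinsic reward and generates all behaviour, while an exploitation policy is trained off-policy on the extrinsic reward alone. The toy environment, the count-based intrinsic bonus, and all names (ChainEnv, beta, q_explore, q_exploit) are illustrative assumptions.

```python
# Illustrative sketch (assumed, not the authors' code): decoupled exploration and
# exploitation with tabular Q-learning and a count-based intrinsic reward.
import random
from collections import defaultdict

class ChainEnv:
    """Toy sparse-reward chain: extrinsic reward 1 only at the rightmost state."""
    def __init__(self, n=10):
        self.n = n
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):  # a in {0: left, 1: right}
        self.s = max(0, min(self.n - 1, self.s + (1 if a == 1 else -1)))
        done = self.s == self.n - 1
        return self.s, float(done), done

def eps_greedy(q, s, eps, n_actions=2):
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[(s, a)])

def q_update(q, s, a, r, s2, alpha=0.1, gamma=0.99, n_actions=2):
    target = r + gamma * max(q[(s2, a2)] for a2 in range(n_actions))
    q[(s, a)] += alpha * (target - q[(s, a)])

env = ChainEnv()
q_explore = defaultdict(float)  # exploration policy: trained on extrinsic + intrinsic reward
q_exploit = defaultdict(float)  # exploitation policy: trained on extrinsic reward only
counts = defaultdict(int)       # state visitation counts for the intrinsic bonus
beta = 0.5                      # intrinsic reward scale (hyperparameter)

for episode in range(500):
    s = env.reset()
    for t in range(50):
        a = eps_greedy(q_explore, s, eps=0.1)         # behaviour comes from the exploration policy
        s2, r_ext, done = env.step(a)
        counts[s2] += 1
        r_int = beta / counts[s2] ** 0.5              # count-based novelty bonus
        q_update(q_explore, s, a, r_ext + r_int, s2)  # exploration policy sees the shaped reward
        q_update(q_exploit, s, a, r_ext, s2)          # exploitation policy learns off-policy from extrinsic reward alone
        s = s2
        if done:
            break
```

Because only the exploitation policy is evaluated, scaling the intrinsic bonus (beta) changes how data is gathered but never shapes the reward that the evaluated policy optimizes; this is the robustness property the abstract highlights.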
