Poster
Model-Based Active Exploration
Pranav Shyam · Wojciech Jaśkowski · Faustino Gomez
Pacific Ballroom #46
Keywords: [ Active Learning ] [ Bayesian Deep Learning ] [ Deep Reinforcement Learning ] [ Planning and Control ] [ Robotics ]
Efficient exploration is an unsolved problem in Reinforcement Learning which is usually addressed by reactively rewarding the agent for fortuitously encountering novel situations. This paper introduces an efficient active exploration algorithm, Model-Based Active eXploration (MAX), which uses an ensemble of forward models to plan to observe novel events. This is carried out by optimizing agent behaviour with respect to a measure of novelty derived from the Bayesian perspective of exploration, which is estimated using the disagreement between the futures predicted by the ensemble members. We show empirically that in semi-random discrete environments where directed exploration is critical to make progress, MAX is at least an order of magnitude more efficient than strong baselines. MAX scales to high-dimensional continuous environments where it builds task-agnostic models that can be used for any downstream task.
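To make the ensemble-disagreement idea concrete, here is a minimal Python sketch of one-step active exploration: an ensemble of forward models predicts the next state for each candidate action, disagreement across members is scored as a novelty signal, and the most novel action is selected. The variance-based disagreement measure, the function names, and the toy linear models below are illustrative assumptions, not the utility or architecture used in the paper.

import numpy as np

# Illustrative sketch (not the paper's exact utility): each ensemble
# member maps (state, action) -> predicted next state, and novelty is
# approximated by the variance of those predictions.

def predict_ensemble(models, state, action):
    """Collect next-state predictions from every ensemble member."""
    return np.stack([m(state, action) for m in models])  # shape: (E, state_dim)

def novelty(models, state, action):
    """Disagreement between ensemble members, averaged over state dimensions."""
    preds = predict_ensemble(models, state, action)
    return preds.var(axis=0).mean()

def plan_most_novel_action(models, state, candidate_actions):
    """Greedy one-step planner: pick the action whose outcome the ensemble disagrees on most."""
    scores = [novelty(models, state, a) for a in candidate_actions]
    return candidate_actions[int(np.argmax(scores))]

# Toy usage with random linear "models" standing in for learned networks.
rng = np.random.default_rng(0)
models = [lambda s, a, W=rng.normal(size=(3, 5)): W @ np.concatenate([s, a])
          for _ in range(5)]
state = rng.normal(size=3)
actions = [rng.normal(size=2) for _ in range(8)]
best_action = plan_most_novel_action(models, state, actions)

A full treatment would plan over multi-step futures and retrain the ensemble on the data gathered, but the core mechanism, using ensemble disagreement as the objective to be maximized rather than as a reactive bonus, is the same.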