Poster in Workshop: ICML 2021 Workshop on Unsupervised Reinforcement Learning
Discovering and Achieving Goals with World Models
Russell Mendonca · Oleh Rybkin · Kostas Daniilidis · Danijar Hafner · Deepak Pathak
How can an artificial agent learn to solve a wide range of tasks in a complex visual environment in the absence of external supervision? We decompose this question into two problems: global exploration of the environment, and learning to reliably reach situations found during exploration. We introduce the Explore Achieve Network (ExaNet), a unified solution to both problems that learns a world model from high-dimensional images and uses it to train an explorer policy and an achiever policy from imagined trajectories. Unlike prior methods that explore by reaching previously visited states, our explorer plans through foresight to discover unseen, surprising states, which then serve as diverse targets for the achiever. After the unsupervised phase, ExaNet solves tasks specified by goal images without any additional learning. We introduce a challenging benchmark spanning four standard robotic manipulation and locomotion domains, with over 40 test tasks in total. Our agent substantially outperforms previous approaches to unsupervised goal reaching and achieves goals that require interacting with multiple objects in sequence. Finally, to demonstrate the scalability and generality of our approach, we train a single general agent across four distinct environments. For videos, see https://sites.google.com/view/exanet/home.
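
The two-phase structure described in the abstract (unsupervised exploration that builds a world model, followed by goal reaching that plans inside that model) can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it is hypothetical. It assumes a small deterministic grid world in place of image observations, a lookup table in place of a learned neural world model, a "prefer untried actions" rule in place of the explorer policy, and breadth-first search over the table in place of an achiever policy trained on imagined trajectories.

```python
import random
from collections import deque

SIZE = 5  # hypothetical 5x5 grid world standing in for a visual environment
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def env_step(state, action):
    """True environment dynamics: move one cell, clipping at the walls."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), SIZE - 1), min(max(y + dy, 0), SIZE - 1))

def explore(start, steps, rng):
    """Unsupervised phase: collect transitions, preferring untried actions."""
    model = {}  # tabular "world model": (state, action) -> next state
    state = start
    for _ in range(steps):
        untried = [a for a in ACTIONS if (state, a) not in model]
        action = rng.choice(untried or ACTIONS)
        nxt = env_step(state, action)
        model[(state, action)] = nxt
        state = nxt
    return model

def achieve(model, start, goal):
    """Plan inside the learned model only (BFS); return actions or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if state == goal:
            return plan
        for a in ACTIONS:
            nxt = model.get((state, a))
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [a]))
    return None

def run(seed=0, steps=2000):
    """Explore, then try to reach every discovered state as a goal."""
    rng = random.Random(seed)
    start = (0, 0)
    model = explore(start, steps, rng)
    goals = sorted(set(model.values()))  # states found during exploration
    successes = 0
    for goal in goals:
        plan = achieve(model, start, goal)
        state = start
        for a in plan or []:
            state = env_step(state, a)  # execute the plan in the real env
        successes += int(plan is not None and state == goal)
    return len(goals), successes

if __name__ == "__main__":
    n_goals, n_success = run()
    print(f"reached {n_success} of {n_goals} discovered goals")
```

In this toy setting, the states discovered during exploration double as the goal set, mirroring the idea that exploration supplies diverse targets for goal reaching. No extra learning happens at test time because planning uses only the stored model.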