Timezone: »

Discovering and Achieving Goals with World Models
Russell Mendonca · Oleh Rybkin · Kostas Daniilidis · Danijar Hafner · Deepak Pathak

How can an artificial agent learn to solve a wide range of tasks in a complex visual environment in the absence of external supervision? We decompose this question into two problems, global exploration of the environment and learning to reliably reach situations found during exploration. We introduce the Explore Achieve Network (ExaNet), a unified solution to these by learning a world model from the high-dimensional images and using it to train an explorer and an achiever policy from imagined trajectories. Unlike prior methods that explore by reaching previously visited states, our explorer plans to discover unseen surprising states through foresight, which are then used as diverse targets for the achiever. After the unsupervised phase, ExaNet solves tasks specified by goal images without any additional learning. We introduce a challenging benchmark spanning across four standard robotic manipulation and locomotion domains with a total of over 40 test tasks. Our agent substantially outperforms previous approaches to unsupervised goal reaching and achieves goals that require interacting with multiple objects in sequence. Finally, to demonstrate the scalability and generality of our approach, we train a single general agent across four distinct environments. For videos, see https://sites.google.com/view/exanet/home.

Author Information

Russell Mendonca (Carnegie Mellon University)
Oleh Rybkin (University of Pennsylvania)

Oleg is a Ph.D. student in the GRASP laboratory at the University of Pennsylvania advised by Kostas Daniilidis. He received his Bachelor's degree from Czech Technical University in Prague. He is interested in deep learning and computer vision, and, more specifically, on using deep predictive models to discover semantic structure in video as well as applications of these models for planning. Prior to his Ph.D. studies, he worked on camera geometry as an undergraduate researcher advised by Tomas Pajdla. He was a visiting student researcher at INRIA advised by Josef Sivic, Tokyo Institute of Technology advised by Akihiko Torii, and UC Berkeley advised by Sergey Levine.

Kostas Daniilidis (University of Pennsylvania)
Danijar Hafner (Google Brain & University of Toronto)
Deepak Pathak (CMU, FAIR)

More from the Same Authors