In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e.g., images obtained from self-supervised robot interaction. Most previous works on VP approached the problem by planning in a learned latent space, resulting in low-quality visual plans and difficult training algorithms. Here, instead, we propose a simple VP method that plans directly in image space and displays competitive performance. We build on the semi-parametric topological memory (SPTM) method: image samples are treated as nodes in a graph, the graph connectivity is learned from image sequence data, and planning can be performed using conventional graph search methods. We propose two modifications to SPTM. First, we train an energy-based graph connectivity function using contrastive predictive coding, which admits stable training. Second, to allow zero-shot planning in new domains, we learn a conditional VAE model that generates images given a context describing the domain, and use these hallucinated samples for building the connectivity graph and planning. We show that this simple approach significantly outperforms the state-of-the-art VP methods, in terms of both plan interpretability and success rate when using the plan to guide a trajectory-following controller. Interestingly, our method can pick up non-trivial visual properties of objects, such as their geometry, and account for them in the plans.
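The SPTM-style pipeline the abstract describes can be sketched in a few lines: embed image samples, score pairwise connectivity, build a graph, and plan with graph search. The sketch below is illustrative only, under stated assumptions: the learned CPC energy is replaced by a simple Euclidean-distance threshold on toy embeddings, and breadth-first search stands in for the paper's graph search; the names `connectivity`, `build_graph`, and `plan` are hypothetical, not from the paper.

```python
from collections import deque
import numpy as np

def connectivity(z_i, z_j, threshold=1.5):
    # Stand-in for the learned energy-based connectivity function:
    # here, two samples are "connected" if their embeddings are close.
    return np.linalg.norm(z_i - z_j) < threshold

def build_graph(embeddings, threshold=1.5):
    # Treat each image sample as a node; add an edge between every
    # pair the connectivity function accepts.
    n = len(embeddings)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if connectivity(embeddings[i], embeddings[j], threshold):
                adj[i].append(j)
                adj[j].append(i)
    return adj

def plan(adj, start, goal):
    # Breadth-first search returns the shortest visual plan as a
    # sequence of node indices (i.e., image samples).
    queue, parent = deque([start]), {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

# Toy "embeddings": points along a line, so planning from node 0 to
# node 4 must chain through the intermediate samples.
embeddings = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
adj = build_graph(embeddings)
path = plan(adj, 0, 4)
print(path)  # -> [0, 1, 2, 3, 4]
```

In the actual method, the nodes would be hallucinated samples from the conditional VAE for the new domain, and edge weights would come from the contrastively trained connectivity score rather than a fixed distance threshold.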
Author Information
Kara Liu (UC Berkeley)
Thanard Kurutach (UC Berkeley)
Christine Tung (UC Berkeley)
Pieter Abbeel (UC Berkeley)
Aviv Tamar (Technion)
More from the Same Authors
- 2023 Poster: TGRL: Teacher Guided Reinforcement Learning Algorithm »
  Idan Shenfeld · Zhang-Wei Hong · Aviv Tamar · Pulkit Agrawal
- 2023 Poster: ContraBAR: Contrastive Bayes-Adaptive Deep RL »
  Era Choshen · Aviv Tamar
- 2023 Poster: Learning Control by Iterative Inversion »
  Gal Leibovich · Guy Jacob · Or Avner · Gal Novik · Aviv Tamar
- 2022 Poster: Unsupervised Image Representation Learning with Deep Latent Particles »
  Tal Daniel · Aviv Tamar
- 2022 Spotlight: Unsupervised Image Representation Learning with Deep Latent Particles »
  Tal Daniel · Aviv Tamar
- 2020 Poster: Sub-Goal Trees -- a Framework for Goal-Based Reinforcement Learning »
  Tom Jurgenson · Or Avner · Edward Groshev · Aviv Tamar
- 2019 Poster: Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules »
  Daniel Ho · Eric Liang · Peter Chen · Ion Stoica · Pieter Abbeel
- 2019 Poster: Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN »
  Dror Freirich · Tzahi Shimkin · Ron Meir · Aviv Tamar
- 2019 Poster: A Deep Reinforcement Learning Perspective on Internet Congestion Control »
  Nathan Jay · Noga H. Rotman · Brighten Godfrey · Michael Schapira · Aviv Tamar
- 2019 Oral: Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN »
  Dror Freirich · Tzahi Shimkin · Ron Meir · Aviv Tamar
- 2019 Oral: A Deep Reinforcement Learning Perspective on Internet Congestion Control »
  Nathan Jay · Noga H. Rotman · Brighten Godfrey · Michael Schapira · Aviv Tamar
- 2019 Oral: Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules »
  Daniel Ho · Eric Liang · Peter Chen · Ion Stoica · Pieter Abbeel
- 2017 Poster: Constrained Policy Optimization »
  Joshua Achiam · David Held · Aviv Tamar · Pieter Abbeel
- 2017 Talk: Constrained Policy Optimization »
  Joshua Achiam · David Held · Aviv Tamar · Pieter Abbeel