We propose iterative inversion, an algorithm for learning an inverse function without input-output pairs, using only samples from the desired output distribution and access to the forward function. The key challenge is a distribution shift between the desired outputs and the outputs of an initial random guess, and we prove that iterative inversion can steer the learning correctly, under rather strict conditions on the function. We apply iterative inversion to learn control. Our input is a set of demonstrations of desired behavior, given as video embeddings of trajectories (without actions), and our method iteratively learns to imitate trajectories generated by the current policy, perturbed by random exploration noise. Our approach does not require rewards and employs only supervised learning, which can be easily scaled to use state-of-the-art trajectory embedding techniques and policy representations. Indeed, with a VQ-VAE embedding and a transformer-based policy, we demonstrate non-trivial continuous control on several tasks (videos available at https://sites.google.com/view/iter-inver). Further, we report improved performance on imitating diverse behaviors compared to reward-based methods.
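The loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a linear forward function and a linear inverse model fitted by least squares, in place of the VQ-VAE embedding and transformer-based policy, and the names `iterative_inversion`, `forward`, `noise_std`, and `n_iters` are hypothetical. Each iteration proposes inputs for the desired outputs using the current inverse, perturbs them with exploration noise, runs the forward function, and then fits the inverse by supervised regression on the resulting (output, input) pairs.

```python
import numpy as np

def iterative_inversion(forward, desired_outputs, dim_in,
                        n_iters=5, noise_std=0.1, seed=0):
    """Toy sketch: learn a linear inverse x = W y of `forward` (assumed setup)."""
    rng = np.random.default_rng(seed)
    n, dim_out = desired_outputs.shape
    # Initial random guess for the inverse model.
    W = rng.normal(scale=0.1, size=(dim_in, dim_out))
    for _ in range(n_iters):
        # Inputs proposed by the current inverse, plus exploration noise.
        x = desired_outputs @ W.T + noise_std * rng.normal(size=(n, dim_in))
        # Outputs actually produced by the forward function.
        y = forward(x)
        # Supervised step: regress the inputs on the outputs they produced.
        sol, *_ = np.linalg.lstsq(y, x, rcond=None)  # y @ sol ~= x
        W = sol.T
    return W

# Hypothetical forward function: an invertible linear map.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
forward = lambda x: x @ A.T

# Samples from the desired output distribution (no paired inputs given).
desired = np.random.default_rng(1).normal(size=(256, 2)) + 5.0

W = iterative_inversion(forward, desired, dim_in=2)
recovered = forward(desired @ W.T)  # outputs reached via the learned inverse
```

In this linear toy case the regression recovers the inverse regardless of where the sampled outputs fall, so the distribution shift discussed in the abstract does not arise; the sketch only shows the structure of the loop.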
Author Information
Gal Leibovich (Intel Corporation)
Guy Jacob (Intel Labs)
Or Avner (Technion)
Gal Novik (Intel Corporation)
Aviv Tamar (Technion)
More from the Same Authors
- 2023 Poster: From Temporal to Contemporaneous Iterative Causal Discovery in the Presence of Latent Confounders
  Raanan Yehezkel Rohekar · Shami Nisimov · Yaniv Gurwicz · Gal Novik
- 2023 Poster: ContraBAR: Contrastive Bayes-Adaptive Deep RL
  Era Choshen · Aviv Tamar
- 2023 Poster: TGRL: An Algorithm for Teacher Guided Reinforcement Learning
  Idan Shenfeld · Zhang-Wei Hong · Aviv Tamar · Pulkit Agrawal
- 2022 Poster: Unsupervised Image Representation Learning with Deep Latent Particles
  Tal Daniel · Aviv Tamar
- 2022 Spotlight: Unsupervised Image Representation Learning with Deep Latent Particles
  Tal Daniel · Aviv Tamar
- 2020 Poster: Hallucinative Topological Memory for Zero-Shot Visual Planning
  Kara Liu · Thanard Kurutach · Christine Tung · Pieter Abbeel · Aviv Tamar
- 2020 Poster: Sub-Goal Trees -- a Framework for Goal-Based Reinforcement Learning
  Tom Jurgenson · Or Avner · Edward Groshev · Aviv Tamar
- 2019 Poster: Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN
  dror freirich · Tzahi Shimkin · Ron Meir · Aviv Tamar
- 2019 Poster: A Deep Reinforcement Learning Perspective on Internet Congestion Control
  Nathan Jay · Noga H. Rotman · Brighten Godfrey · Michael Schapira · Aviv Tamar
- 2019 Oral: Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN
  dror freirich · Tzahi Shimkin · Ron Meir · Aviv Tamar
- 2019 Oral: A Deep Reinforcement Learning Perspective on Internet Congestion Control
  Nathan Jay · Noga H. Rotman · Brighten Godfrey · Michael Schapira · Aviv Tamar
- 2017 Poster: Constrained Policy Optimization
  Joshua Achiam · David Held · Aviv Tamar · Pieter Abbeel
- 2017 Talk: Constrained Policy Optimization
  Joshua Achiam · David Held · Aviv Tamar · Pieter Abbeel