Poster
ContraBAR: Contrastive Bayes-Adaptive Deep RL
Era Choshen · Aviv Tamar
In meta reinforcement learning (meta RL), an agent seeks a Bayes-optimal policy: the optimal policy when facing an unknown task that is sampled from a known task distribution. Previous approaches tackled this problem by inferring a $\textit{belief}$ over task parameters using variational inference methods. Motivated by recent successes of contrastive learning approaches in RL, such as contrastive predictive coding (CPC), we investigate whether contrastive methods can be used for learning Bayes-optimal behavior. We begin by proving that representations learned by CPC are indeed sufficient for Bayes optimality. Based on this observation, we propose a simple meta RL algorithm that uses CPC in lieu of variational belief inference. Our method, $\textit{ContraBAR}$, achieves performance comparable to the state of the art in domains with state-based observations and circumvents the computational toll of future observation reconstruction, enabling learning in domains with image-based observations. It can also be combined with image augmentations for domain randomization and used seamlessly in both online and offline meta RL settings.
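To make the contrastive objective concrete, here is a minimal sketch of the InfoNCE loss commonly used in CPC, written with NumPy. It is not the authors' implementation: the function name `info_nce_loss`, the bilinear score matrix `W`, and the use of other batch entries as negatives are all assumptions made for illustration. Each row of `context` stands for an embedding of an agent's history, and the matching row of `futures` stands for an embedding of a future observation from the same trajectory.

```python
import numpy as np


def info_nce_loss(context, futures, W):
    """InfoNCE loss over a batch (illustrative sketch, hypothetical names).

    context: (B, d_c) array of history embeddings.
    futures: (B, d_z) array of future-observation embeddings; row i is the
             positive for context row i, all other rows act as negatives.
    W:       (d_c, d_z) bilinear score matrix (an assumed score function).
    """
    # scores[i, j] = context_i^T W futures_j
    scores = context @ W @ futures.T
    # Subtract the row-wise max before exponentiating, for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    # Row-wise log-softmax: log-probability of picking column j for row i.
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    # The positive pair for row i sits on the diagonal.
    return -np.mean(np.diag(log_probs))


# Usage with random embeddings (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
c = rng.normal(size=(8, 16))
z = rng.normal(size=(8, 32))
W = rng.normal(size=(16, 32)) * 0.1
loss = info_nce_loss(c, z, W)
```

With an uninformative score function (for example `W` set to zeros), the loss equals the log of the batch size; it approaches zero as positives score far above negatives. Minimizing it therefore pushes the history embedding to retain whatever information distinguishes the true future from the alternatives.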
Author Information
Era Choshen (Technion)
Aviv Tamar (Technion)
More from the Same Authors
- 2023 Poster: Learning Control by Iterative Inversion
  Gal Leibovich · Guy Jacob · Or Avner · Gal Novik · Aviv Tamar
- 2023 Poster: TGRL: An Algorithm for Teacher Guided Reinforcement Learning
  Idan Shenfeld · Zhang-Wei Hong · Aviv Tamar · Pulkit Agrawal
- 2022 Poster: Unsupervised Image Representation Learning with Deep Latent Particles
  Tal Daniel · Aviv Tamar
- 2022 Spotlight: Unsupervised Image Representation Learning with Deep Latent Particles
  Tal Daniel · Aviv Tamar
- 2020 Poster: Hallucinative Topological Memory for Zero-Shot Visual Planning
  Kara Liu · Thanard Kurutach · Christine Tung · Pieter Abbeel · Aviv Tamar
- 2020 Poster: Sub-Goal Trees -- a Framework for Goal-Based Reinforcement Learning
  Tom Jurgenson · Or Avner · Edward Groshev · Aviv Tamar
- 2019 Poster: Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN
  Dror Freirich · Tzahi Shimkin · Ron Meir · Aviv Tamar
- 2019 Poster: A Deep Reinforcement Learning Perspective on Internet Congestion Control
  Nathan Jay · Noga H. Rotman · Brighten Godfrey · Michael Schapira · Aviv Tamar
- 2019 Oral: Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN
  Dror Freirich · Tzahi Shimkin · Ron Meir · Aviv Tamar
- 2019 Oral: A Deep Reinforcement Learning Perspective on Internet Congestion Control
  Nathan Jay · Noga H. Rotman · Brighten Godfrey · Michael Schapira · Aviv Tamar
- 2017 Poster: Constrained Policy Optimization
  Joshua Achiam · David Held · Aviv Tamar · Pieter Abbeel
- 2017 Talk: Constrained Policy Optimization
  Joshua Achiam · David Held · Aviv Tamar · Pieter Abbeel