Skills or low-level policies in reinforcement learning are temporally extended actions that can speed up learning and enable complex behaviours. Recent work in offline reinforcement learning and imitation learning has proposed several techniques for skill discovery from a set of expert trajectories. While these methods are promising, the number K of skills to discover is always a fixed hyperparameter, which requires either prior knowledge about the environment or an additional parameter search to tune it. We first propose a method for offline learning of options (a particular skill framework) exploiting advances in variational inference and continuous relaxations. We then highlight an unexplored connection between Bayesian nonparametrics and offline skill discovery, and show how to obtain a nonparametric version of our model. This version is tractable thanks to a carefully structured approximate posterior with a dynamically-changing number of options, removing the need to specify K. We also show how our nonparametric extension can be applied in other skill frameworks, and empirically demonstrate that our method can outperform state-of-the-art offline skill learning algorithms across a variety of environments.
Author Information
Valentin Villecroze (Layer 6 AI)
Harry Braviner (Layer 6 AI)
Panteha Naderian (Layer 6 AI)
Chris Maddison (University of Toronto)
Gabriel Loaiza-Ganem (Layer 6 AI)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Bayesian Nonparametrics for Offline Skill Discovery
  Wed. Jul 20th through Thu. Jul 21st, Room Hall E #900
More from the Same Authors
- 2022: Contrastive Learning Can Find An Optimal Basis For Approximately Invariant Functions
  Daniel D. Johnson · Ayoub El Hanchi · Chris Maddison
- 2023 Poster: TR0N: Translator Networks for 0-Shot Plug-and-Play Conditional Generation
  Zhaoyan Liu · Noël Vouitsis · Satya Krishna Gorti · Jimmy Ba · Gabriel Loaiza-Ganem
- 2022: Neural Implicit Manifold Learning for Topology-Aware Generative Modelling
  Brendan Ross · Gabriel Loaiza-Ganem · Anthony Caterini · Jesse Cresswell
- 2022 Poster: Augment with Care: Contrastive Learning for Combinatorial Problems
  Haonan Duan · Pashootan Vaezipoor · Max Paulus · Yangjun Ruan · Chris Maddison
- 2022 Poster: Learning to Cut by Looking Ahead: Cutting Plane Selection via Imitation Learning
  Max Paulus · Giulia Zarpellon · Andreas Krause · Laurent Charlin · Chris Maddison
- 2022 Spotlight: Augment with Care: Contrastive Learning for Combinatorial Problems
  Haonan Duan · Pashootan Vaezipoor · Max Paulus · Yangjun Ruan · Chris Maddison
- 2022 Spotlight: Learning to Cut by Looking Ahead: Cutting Plane Selection via Imitation Learning
  Max Paulus · Giulia Zarpellon · Andreas Krause · Laurent Charlin · Chris Maddison
- 2022 Poster: Stochastic Reweighted Gradient Descent
  Ayoub El Hanchi · David Stephens · Chris Maddison
- 2022 Spotlight: Stochastic Reweighted Gradient Descent
  Ayoub El Hanchi · David Stephens · Chris Maddison
- 2021 Poster: Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding
  Yangjun Ruan · Karen Ullrich · Daniel Severo · James Townsend · Ashish Khisti · Arnaud Doucet · Alireza Makhzani · Chris Maddison
- 2021 Oral: Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding
  Yangjun Ruan · Karen Ullrich · Daniel Severo · James Townsend · Ashish Khisti · Arnaud Doucet · Alireza Makhzani · Chris Maddison
- 2021 Poster: Oops I Took A Gradient: Scalable Sampling for Discrete Distributions
  Will Grathwohl · Kevin Swersky · Milad Hashemi · David Duvenaud · Chris Maddison
- 2021 Oral: Oops I Took A Gradient: Scalable Sampling for Discrete Distributions
  Will Grathwohl · Kevin Swersky · Milad Hashemi · David Duvenaud · Chris Maddison
- 2020: Q&A: Chris Maddison
  Chris Maddison · Jessica Forde · Jesse Dodge
- 2020: Invited Talk: Chris Maddison
  Chris Maddison
- 2020 Poster: The continuous categorical: a novel simplex-valued exponential family
  Elliott Gordon-Rodriguez · Gabriel Loaiza-Ganem · John Cunningham