In the pursuit of increasingly intelligent learning systems, abstraction plays a vital role in enabling sophisticated decisions to be made in complex environments. The options framework provides a formalism for such abstraction over sequences of decisions. However, most models require that options be given a priori, presumably specified by hand, which is neither efficient nor scalable. Indeed, it is preferable to learn options directly from interaction with the environment. Despite several efforts, this remains a difficult problem. In this work we develop a novel policy gradient method for the automatic learning of policies with options. This algorithm uses inference methods to simultaneously improve all of the options available to an agent, and thus can be employed in an off-policy manner, without observing option labels. The differentiable inference procedure employed yields options that can be easily interpreted. Empirical results confirm these attributes and indicate that our algorithm has improved sample efficiency relative to the state of the art in learning options end-to-end.
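To make the idea of inference over unobserved options concrete, here is a minimal sketch. It is not the authors' algorithm, only an illustration under stated assumptions: a small tabular setting in which a forward recursion marginalizes over the hidden option at each step. The function name `option_forward_pass` and the table layouts `mu` (policy over options), `pi` (intra-option policies), and `beta` (termination probabilities) are hypothetical and chosen for this example.

```python
def option_forward_pass(states, actions, mu, pi, beta):
    """Forward recursion over latent options (illustrative sketch).

    Assumed table layouts (hypothetical, not from the paper):
      mu[s][o]    -- probability of selecting option o in state s
      pi[o][s][a] -- probability that option o takes action a in state s
      beta[o][s]  -- probability that option o terminates on entering state s

    Returns the likelihood of the action sequence given the states, and the
    filtered posterior over which option is active at the final step.
    """
    num_options = len(pi)

    # First step: pick an option from mu, then emit the first action.
    s0, a0 = states[0], actions[0]
    alpha = [mu[s0][o] * pi[o][s0][a0] for o in range(num_options)]

    for s, a in zip(states[1:], actions[1:]):
        # Probability mass of options that terminate on entering state s.
        terminated = sum(alpha[o] * beta[o][s] for o in range(num_options))
        # Either option o continues, or some option terminated and o is re-selected.
        alpha = [
            (alpha[o] * (1.0 - beta[o][s]) + terminated * mu[s][o]) * pi[o][s][a]
            for o in range(num_options)
        ]

    likelihood = sum(alpha)
    posterior = [x / likelihood for x in alpha]
    return likelihood, posterior


# Toy usage: one state, two actions, two options that never terminate.
mu = [[0.5, 0.5]]
pi = [[[0.9, 0.1]], [[0.1, 0.9]]]
beta = [[0.0], [0.0]]
likelihood, posterior = option_forward_pass([0, 0], [0, 0], mu, pi, beta)
```

Because the recursion uses only sums and products of the policy tables, the resulting likelihood is differentiable with respect to every option's parameters at once. This is the general property that lets a gradient step update all options without observing option labels. The actual estimator used in the paper may differ.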
Author Information
Matthew Smith (McGill University)
Herke van Hoof (McGill University)
Joelle Pineau (McGill University / Facebook)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Oral: An Inference-Based Policy Gradient Method for Learning Options »
Wed. Jul 11th 01:20 -- 01:30 PM Room A1
More from the Same Authors
-
2023 : Fostering Women's Leadership in the Realm of Emerging Trends and Technologies »
Joelle Pineau · Rihab Gorsane · Pascale FUNG -
2023 : Joelle Pineau - A culture of open and reproducible research, in the era of large AI generative models »
Joelle Pineau -
2023 Panel: The Societal Impacts of AI »
Sanmi Koyejo · Samy Bengio · Ashia Wilson · Kirikowhai Mikaere · Joelle Pineau -
2021 Workshop: ICML 2021 Workshop on Unsupervised Reinforcement Learning »
Feryal Behbahani · Joelle Pineau · Lerrel Pinto · Roberta Raileanu · Aravind Srinivas · Denis Yarats · Amy Zhang -
2021 Poster: OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation »
Jongmin Lee · Wonseok Jeon · Byung-Jun Lee · Joelle Pineau · Kee-Eung Kim -
2021 Spotlight: OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation »
Jongmin Lee · Wonseok Jeon · Byung-Jun Lee · Joelle Pineau · Kee-Eung Kim -
2020 Workshop: MLRetrospectives: A Venue for Self-Reflection in ML Research »
Jessica Forde · Jesse Dodge · Mayoore Jaiswal · Rosanne Liu · Ryan Lowe · Rosanne Liu · Joelle Pineau · Yoshua Bengio -
2020 Poster: Online Learned Continual Compression with Adaptive Quantization Modules »
Lucas Caccia · Eugene Belilovsky · Massimo Caccia · Joelle Pineau -
2020 Poster: Constrained Markov Decision Processes via Backward Value Functions »
Harsh Satija · Philip Amortila · Joelle Pineau -
2020 Poster: Interference and Generalization in Temporal Difference Learning »
Emmanuel Bengio · Joelle Pineau · Doina Precup -
2020 Poster: Invariant Causal Prediction for Block MDPs »
Amy Zhang · Clare Lyle · Shagun Sodhani · Angelos Filos · Marta Kwiatkowska · Joelle Pineau · Yarin Gal · Doina Precup -
2019 Workshop: Generative Modeling and Model-Based Reasoning for Robotics and AI »
Aravind Rajeswaran · Emanuel Todorov · Igor Mordatch · William Agnew · Amy Zhang · Joelle Pineau · Michael Chang · Dumitru Erhan · Sergey Levine · Kimberly Stachenfeld · Marvin Zhang -
2019 Poster: Separable value functions across time-scales »
Joshua Romoff · Peter Henderson · Ahmed Touati · Yann Ollivier · Joelle Pineau · Emma Brunskill -
2019 Oral: Separable value functions across time-scales »
Joshua Romoff · Peter Henderson · Ahmed Touati · Yann Ollivier · Joelle Pineau · Emma Brunskill -
2018 Poster: Addressing Function Approximation Error in Actor-Critic Methods »
Scott Fujimoto · Herke van Hoof · David Meger -
2018 Poster: Focused Hierarchical RNNs for Conditional Sequence Processing »
Rosemary Nan Ke · Konrad Zolna · Alessandro Sordoni · Zhouhan Lin · Adam Trischler · Yoshua Bengio · Joelle Pineau · Laurent Charlin · Christopher Pal -
2018 Oral: Addressing Function Approximation Error in Actor-Critic Methods »
Scott Fujimoto · Herke van Hoof · David Meger -
2018 Oral: Focused Hierarchical RNNs for Conditional Sequence Processing »
Rosemary Nan Ke · Konrad Zolna · Alessandro Sordoni · Zhouhan Lin · Adam Trischler · Yoshua Bengio · Joelle Pineau · Laurent Charlin · Christopher Pal -
2017 Workshop: Reproducibility in Machine Learning Research »
Rosemary Nan Ke · Anirudh Goyal · Alex Lamb · Joelle Pineau · Samy Bengio · Yoshua Bengio -
2017 : Lifelong Learning - Panel Discussion »
Sergey Levine · Joelle Pineau · Balaraman Ravindran · Andrei A Rusu -
2017 : Joelle Pineau: A few modest insights from my lifelong learning »
Joelle Pineau