Offline reinforcement learning leverages large datasets to train policies without interacting with the environment. The learned policies may then be deployed in real-world settings where interactions are costly or dangerous. Current algorithms overfit to the training dataset and consequently perform poorly when deployed to out-of-distribution generalizations of the environment. We aim to address these limitations by learning a Koopman latent representation which allows us to infer symmetries of the system's underlying dynamics. These symmetries are then used to extend the otherwise static offline dataset during training; this constitutes a novel data augmentation framework which reflects the system's dynamics and can thus be interpreted as an exploration of the environment's phase space. To obtain the symmetries we employ Koopman theory, in which nonlinear dynamics are represented in terms of a linear operator acting on the space of measurement functions of the system. We provide novel theoretical results on the existence and nature of symmetries relevant for control systems such as reinforcement learning settings. Moreover, we empirically evaluate our method on several benchmark offline reinforcement learning tasks and datasets, including D4RL, Metaworld and Robosuite, and find that our framework consistently improves the state of the art of model-free Q-learning methods.
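The augmentation idea in the abstract can be made concrete with a small sketch. The following is a minimal NumPy illustration, not the authors' implementation: in a Koopman latent space the dynamics act linearly, z' = Kz, and any map T that commutes with K (TK = KT) sends valid transitions to valid transitions, so applying T to a logged transition yields a new, dynamics-consistent sample. The operators K and T and the random latent dataset below are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of Koopman-symmetry data augmentation (illustrative only).
# Assumption: states have already been encoded into a latent space where
# the dynamics are linear, z' = K z.

rng = np.random.default_rng(0)

theta = 0.3
K = np.array([[np.cos(theta), -np.sin(theta)],   # latent Koopman operator:
              [np.sin(theta),  np.cos(theta)]])  # a 2D rotation, for illustration

phi = 1.1
T = np.array([[np.cos(phi), -np.sin(phi)],       # candidate symmetry: another
              [np.sin(phi),  np.cos(phi)]])      # rotation, so T K = K T holds

assert np.allclose(T @ K, K @ T)  # symmetry condition: T commutes with K

# "Offline dataset": latent transitions (z, z') generated by the dynamics.
Z = rng.normal(size=(128, 2))
Z_next = Z @ K.T

# Augmentation: push each transition through the symmetry.
Z_aug, Z_aug_next = Z @ T.T, Z_next @ T.T

# The augmented pairs obey the same latent dynamics (T z' = T K z = K T z),
# so they can be added to the training buffer as dynamics-consistent data.
assert np.allclose(Z_aug_next, Z_aug @ K.T)
print("augmented transitions are consistent with the Koopman dynamics")
```

In practice the encoder, decoder, Koopman operator and symmetry generators would all be learned from the offline dataset; the sketch only demonstrates why a commuting map produces valid synthetic transitions.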
Author Information
Matthias Weissenbacher (RIKEN Center for Advanced Intelligence Project (AIP))
Samrath Sinha (University of Toronto)
Animesh Garg (University of Toronto, Vector Institute, Nvidia)
Yoshinobu Kawahara (Kyushu University / RIKEN)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics
  Thu. Jul 21st, 06:50 -- 06:55 PM, Room 318 - 320
More from the Same Authors
- 2021: Auditing AI models for Verified Deployment under Semantic Specifications
  Homanga Bharadhwaj · De-An Huang · Chaowei Xiao · Anima Anandkumar · Animesh Garg
- 2021: Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning
  Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang
- 2021: Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
  Shunshi Zhang · Murat Erdogdu · Animesh Garg
- 2021: Learning by Watching: Physical Imitation of Manipulation Skills from Human Videos
  Haoyu Xiong · Yun-Chun Chen · Homanga Bharadhwaj · Samrath Sinha · Animesh Garg
- 2022: VIPer: Iterative Value-Aware Model Learning on the Value Improvement Path
  Romina Abachi · Claas Voelcker · Animesh Garg · Amir-massoud Farahmand
- 2022: MoCoDA: Model-based Counterfactual Data Augmentation
  Silviu Pitis · Elliot Creager · Ajay Mandlekar · Animesh Garg
- 2021 Poster: Principled Exploration via Optimistic Bootstrapping and Backward Induction
  Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang
- 2021 Poster: Value Iteration in Continuous Actions, States and Time
  Michael Lutter · Shie Mannor · Jan Peters · Dieter Fox · Animesh Garg
- 2021 Spotlight: Value Iteration in Continuous Actions, States and Time
  Michael Lutter · Shie Mannor · Jan Peters · Dieter Fox · Animesh Garg
- 2021 Spotlight: Principled Exploration via Optimistic Bootstrapping and Backward Induction
  Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang
- 2021 Poster: Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
  Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviychuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar
- 2021 Poster: Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
  Bo Liu · Qiang Liu · Peter Stone · Animesh Garg · Yuke Zhu · Anima Anandkumar
- 2021 Spotlight: Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
  Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviychuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar
- 2021 Oral: Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
  Bo Liu · Qiang Liu · Peter Stone · Animesh Garg · Yuke Zhu · Anima Anandkumar
- 2020 Poster: Revisiting Training Strategies and Generalization Performance in Deep Metric Learning
  Karsten Roth · Timo Milbich · Samrath Sinha · Prateek Gupta · Bjorn Ommer · Joseph Paul Cohen
- 2020 Poster: Semi-Supervised StyleGAN for Disentanglement Learning
  Weili Nie · Tero Karras · Animesh Garg · Shoubhik Debnath · Anjul Patney · Ankit Patel · Anima Anandkumar
- 2020 Poster: Small-GAN: Speeding up GAN Training using Core-Sets
  Samrath Sinha · Han Zhang · Anirudh Goyal · Yoshua Bengio · Hugo Larochelle · Augustus Odena
- 2020 Poster: Angular Visual Hardness
  Beidi Chen · Weiyang Liu · Zhiding Yu · Jan Kautz · Anshumali Shrivastava · Animesh Garg · Anima Anandkumar