A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions. Errors in this predictive model can degrade the performance of model-based controllers, and complex Markov decision processes (MDPs) can present exceptionally difficult prediction problems. To mitigate this issue, we propose predictable MDP abstraction (PMA): instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space that only permits predictable, easy-to-model actions, while covering the original state-action space as much as possible. As a result, model learning becomes easier and more accurate, which allows robust, stable model-based planning or model-based RL. This transformation is learned in an unsupervised manner, before any task is specified by the user. Downstream tasks can then be solved with model-based control in a zero-shot fashion, without additional environment interactions. We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches in a range of benchmark environments. Our code and videos are available at https://seohong.me/projects/pma/
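To make the idea in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of the two ingredients PMA relies on: a decoder policy that turns a learned latent action into a raw environment action, and a latent dynamics model whose prediction error measures how "predictable" a latent action is. All names, network sizes, and the placeholder transition data are illustrative assumptions; this is not the authors' implementation (see the linked project page for that).

```python
# Hypothetical sketch of the PMA ingredients described in the abstract.
# Names, shapes, and the placeholder data are assumptions for illustration.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, LATENT_DIM = 8, 2, 4

# Decoder policy pi(a | s, z): maps a latent action z (the learned action
# space) to a raw action in the original MDP.
decoder = nn.Sequential(nn.Linear(STATE_DIM + LATENT_DIM, 64), nn.ReLU(),
                        nn.Linear(64, ACTION_DIM))

# Latent dynamics model f(s, z) -> s': predicts the outcome of a latent action.
dynamics = nn.Sequential(nn.Linear(STATE_DIM + LATENT_DIM, 64), nn.ReLU(),
                         nn.Linear(64, STATE_DIM))

opt = torch.optim.Adam(dynamics.parameters(), lr=3e-4)

def prediction_error(s, z, s_next):
    """Model error for a latent action; low error means the action is predictable."""
    pred = dynamics(torch.cat([s, z], dim=-1))
    return ((pred - s_next) ** 2).mean()

# One illustrative update on a placeholder batch of transitions (s, z, s').
# In the full method these transitions come from rolling out the decoder in
# the original MDP; the decoder itself would be trained (e.g., with RL) to
# maximize predictability plus a coverage bonus, which is omitted here.
s = torch.randn(32, STATE_DIM)
z = torch.randn(32, LATENT_DIM)
a = decoder(torch.cat([s, z], dim=-1))          # raw action that would be executed
s_next = s + 0.1 * torch.randn(32, STATE_DIM)   # placeholder environment outcome

loss = prediction_error(s, z, s_next)
opt.zero_grad()
loss.backward()
opt.step()
```

Under this sketch, a downstream task would be solved at test time by planning over sequences of latent actions with the learned dynamics model (e.g., with model-predictive control), without further environment interaction, matching the zero-shot usage described in the abstract.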
Author Information
Seohong Park (University of California, Berkeley)
Sergey Levine (University of California, Berkeley)
More from the Same Authors
-
2021 : Reinforcement Learning as One Big Sequence Modeling Problem »
Michael Janner · Qiyang Li · Sergey Levine -
2021 : Intrinsic Control of Variational Beliefs in Dynamic Partially-Observed Visual Environments »
Nicholas Rhinehart · Jenny Wang · Glen Berseth · John Co-Reyes · Danijar Hafner · Chelsea Finn · Sergey Levine -
2021 : Explore and Control with Adversarial Surprise »
Arnaud Fickinger · Natasha Jaques · Samyak Parajuli · Michael Chang · Nicholas Rhinehart · Glen Berseth · Stuart Russell · Sergey Levine -
2022 : Effective Offline RL Needs Going Beyond Pessimism: Representations and Distributional Shift »
Xinyang Geng · Kevin Li · Abhishek Gupta · Aviral Kumar · Sergey Levine -
2022 : DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning »
Quan Vuong · Aviral Kumar · Sergey Levine · Yevgen Chebotar -
2022 : Distributionally Adaptive Meta Reinforcement Learning »
Anurag Ajay · Dibya Ghosh · Sergey Levine · Pulkit Agrawal · Abhishek Gupta -
2022 : You Only Live Once: Single-Life Reinforcement Learning via Learned Reward Shaping »
Annie Chen · Archit Sharma · Sergey Levine · Chelsea Finn -
2022 : Multimodal Masked Autoencoders Learn Transferable Representations »
Xinyang Geng · Hao Liu · Lisa Lee · Dale Schuurmans · Sergey Levine · Pieter Abbeel -
2023 : Deep Neural Networks Extrapolate Cautiously (Most of the Time) »
Katie Kang · Amrith Setlur · Claire Tomlin · Sergey Levine -
2023 : Offline Goal-Conditioned RL with Latent States as Actions »
Seohong Park · Dibya Ghosh · Benjamin Eysenbach · Sergey Levine -
2023 : Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware »
Tony Zhao · Vikash Kumar · Sergey Levine · Chelsea Finn -
2023 : Training Diffusion Models with Reinforcement Learning »
Kevin Black · Michael Janner · Yilun Du · Ilya Kostrikov · Sergey Levine -
2023 : Video-Guided Skill Discovery »
Manan Tomar · Dibya Ghosh · Vivek Myers · Anca Dragan · Matthew Taylor · Philip Bachman · Sergey Levine -
2023 : Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning »
Mitsuhiko Nakamoto · Yuexiang Zhai · Anikait Singh · Max Sobol Mark · Yi Ma · Chelsea Finn · Aviral Kumar · Sergey Levine -
2023 Poster: Jump-Start Reinforcement Learning »
Ikechukwu Uchendu · Ted Xiao · Yao Lu · Banghua Zhu · Mengyuan Yan · Joséphine Simon · Matthew Bennice · Chuyuan Fu · Cong Ma · Jiantao Jiao · Sergey Levine · Karol Hausman -
2023 Poster: A Connection between One-Step RL and Critic Regularization in Reinforcement Learning »
Benjamin Eysenbach · Matthieu Geist · Sergey Levine · Ruslan Salakhutdinov -
2023 Poster: Adversarial Policies Beat Superhuman Go AIs »
Tony Wang · Adam Gleave · Tom Tseng · Kellin Pelrine · Nora Belrose · Joseph Miller · Michael Dennis · Yawen Duan · Viktor Pogrebniak · Sergey Levine · Stuart Russell -
2023 Oral: Adversarial Policies Beat Superhuman Go AIs »
Tony Wang · Adam Gleave · Tom Tseng · Kellin Pelrine · Nora Belrose · Joseph Miller · Michael Dennis · Yawen Duan · Viktor Pogrebniak · Sergey Levine · Stuart Russell -
2023 Poster: Reinforcement Learning from Passive Data via Latent Intentions »
Dibya Ghosh · Chethan Bhateja · Sergey Levine -
2023 Poster: Controllability-Aware Unsupervised Skill Discovery »
Seohong Park · Kimin Lee · Youngwoon Lee · Pieter Abbeel -
2023 Poster: Understanding the Complexity Gains of Single-Task RL with a Curriculum »
Qiyang Li · Yuexiang Zhai · Yi Ma · Sergey Levine -
2023 Oral: Reinforcement Learning from Passive Data via Latent Intentions »
Dibya Ghosh · Chethan Bhateja · Sergey Levine -
2023 Poster: PaLM-E: An Embodied Multimodal Language Model »
Danny Driess · Fei Xia · Mehdi S. M. Sajjadi · Corey Lynch · Aakanksha Chowdhery · Brian Ichter · Ayzaan Wahid · Jonathan Tompson · Quan Vuong · Tianhe (Kevin) Yu · Wenlong Huang · Yevgen Chebotar · Pierre Sermanet · Daniel Duckworth · Sergey Levine · Vincent Vanhoucke · Karol Hausman · Marc Toussaint · Klaus Greff · Andy Zeng · Igor Mordatch · Pete Florence -
2023 Poster: Efficient Online Reinforcement Learning with Offline Data »
Philip Ball · Laura Smith · Ilya Kostrikov · Sergey Levine