Planning has been very successful for control tasks with known environment dynamics. To leverage planning in unknown environments, the agent needs to learn the dynamics from interactions with the world. However, learning dynamics models that are accurate enough for planning has been a long-standing challenge, especially in image-based domains. We propose the Deep Planning Network (PlaNet), a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space. To achieve high performance, the dynamics model must accurately predict the rewards ahead for multiple time steps. We approach this using a latent dynamics model with both deterministic and stochastic transition components. Moreover, we propose a multi-step variational inference objective that we name latent overshooting. Using only pixel observations, our agent solves continuous control tasks with contact dynamics, partial observability, and sparse rewards, which exceed the difficulty of tasks that were previously solved by planning with learned models. PlaNet uses substantially fewer episodes and reaches final performance close to and sometimes higher than strong model-free algorithms.
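For intuition, the following is a minimal sketch of the kind of latent transition the abstract describes, combining a deterministic recurrent path with a stochastic Gaussian path. This is not the authors' released implementation; the layer sizes, single-layer parameterisation, and random weights are illustrative assumptions only.

```python
# A minimal sketch (not the PlaNet codebase) of a latent transition with both
# deterministic and stochastic components. All sizes and weights are
# hypothetical placeholders standing in for a trained model.
import numpy as np

rng = np.random.default_rng(0)

det_size, stoch_size, action_size = 8, 4, 2  # illustrative dimensions

# Randomly initialised weights for illustration only.
W_det = rng.normal(scale=0.1, size=(det_size + stoch_size + action_size, det_size))
W_mean = rng.normal(scale=0.1, size=(det_size, stoch_size))
W_std = rng.normal(scale=0.1, size=(det_size, stoch_size))

def transition(det_state, stoch_state, action):
    """One latent step with a deterministic and a stochastic component."""
    # Deterministic path: carries information reliably across many steps.
    inputs = np.concatenate([det_state, stoch_state, action])
    det_next = np.tanh(inputs @ W_det)
    # Stochastic path: a diagonal Gaussian conditioned on the deterministic
    # state, letting the model represent several possible futures.
    mean = det_next @ W_mean
    std = np.log1p(np.exp(det_next @ W_std))  # softplus keeps std positive
    stoch_next = mean + std * rng.normal(size=stoch_size)
    return det_next, stoch_next

# Imagine a short trajectory under a candidate action sequence, as a planner
# evaluating rollouts purely in latent space would.
det, stoch = np.zeros(det_size), np.zeros(stoch_size)
for action in rng.normal(size=(5, action_size)):
    det, stoch = transition(det, stoch, action)
```

In this setting, an online planner can imagine many such rollouts for different candidate action sequences, score them with a learned reward predictor over the imagined latent states, and pick the sequence with the highest predicted reward.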
Author Information
Danijar Hafner (Google Brain & University of Toronto)
Timothy Lillicrap (Google DeepMind)
Ian Fischer (Google)
Ruben Villegas (University of Michigan)
David Ha (Google)
Honglak Lee (Google / U. Michigan)
James Davidson (Google Brain)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Oral: Learning Latent Dynamics for Planning from Pixels »
  Thu. Jun 13th, 04:25 – 04:30 PM, Room: Hall B
More from the Same Authors
- 2020: Evaluating Agents without Rewards »
  Danijar Hafner
- 2021: Discovering and Achieving Goals with World Models »
  Russell Mendonca · Oleh Rybkin · Kostas Daniilidis · Danijar Hafner · Deepak Pathak
- 2021: Intrinsic Control of Variational Beliefs in Dynamic Partially-Observed Visual Environments »
  Nicholas Rhinehart · Jenny Wang · Glen Berseth · John Co-Reyes · Danijar Hafner · Chelsea Finn · Sergey Levine
- 2021: Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks »
  Yijie Guo · Qiucheng Wu · Honglak Lee
- 2022: An Investigation into the Open World Survival Game Crafter »
  Aleksandar Stanic · Yujin Tang · David Ha · Jürgen Schmidhuber
- 2023: Guide Your Agent with Adaptive Multimodal Rewards »
  Changyeon Kim · Younggyo Seo · Hao Liu · Lisa Lee · Jinwoo Shin · Honglak Lee · Kimin Lee
- 2023: Learning Higher Order Skills that Efficiently Compose »
  Anthony Liu · Dong Ki Kim · Sungryull Sohn · Honglak Lee
- 2023: Hierarchical Decomposition Framework for Feasibility-hard Combinatorial Optimization »
  Hanbum Ko · Minu Kim · Han-Seul Jeong · Sunghoon Hong · Deunsol Yoon · Youngjoon Park · Woohyung Lim · Honglak Lee · Moontae Lee · Kanghoon Lee · Sungbin Lim · Sungryull Sohn
- 2023: Mixed-Curvature Transformers for Graph Representation Learning »
  Sungjun Cho · Seunghyuk Cho · Sungwoo Park · Hankook Lee · Honglak Lee · Moontae Lee
- 2023 Poster: Go Beyond Imagination: Maximizing Episodic Reachability with World Models »
  Yao Fu · Run Peng · Honglak Lee
- 2023 Poster: Temporally Consistent Transformers for Video Generation »
  Wilson Yan · Danijar Hafner · Stephen James · Pieter Abbeel
- 2022 Poster: Retrieval-Augmented Reinforcement Learning »
  Anirudh Goyal · Abe Friesen · Andrea Banino · Theophane Weber · Nan Rosemary Ke · Adrià Puigdomenech Badia · Arthur Guez · Mehdi Mirza · Peter Humphreys · Ksenia Konyushkova · Michal Valko · Simon Osindero · Timothy Lillicrap · Nicolas Heess · Charles Blundell
- 2022 Spotlight: Retrieval-Augmented Reinforcement Learning »
  Anirudh Goyal · Abe Friesen · Andrea Banino · Theophane Weber · Nan Rosemary Ke · Adrià Puigdomenech Badia · Arthur Guez · Mehdi Mirza · Peter Humphreys · Ksenia Konyushkova · Michal Valko · Simon Osindero · Timothy Lillicrap · Nicolas Heess · Charles Blundell
- 2022 Poster: A data-driven approach for learning to control computers »
  Peter Humphreys · David Raposo · Tobias Pohlen · Gregory Thornton · Rachita Chhaparia · Alistair Muldal · Josh Abramson · Petko Georgiev · Adam Santoro · Timothy Lillicrap
- 2022 Spotlight: A data-driven approach for learning to control computers »
  Peter Humphreys · David Raposo · Tobias Pohlen · Gregory Thornton · Rachita Chhaparia · Alistair Muldal · Josh Abramson · Petko Georgiev · Adam Santoro · Timothy Lillicrap
- 2021: Panel Discussion »
  Rosemary Nan Ke · Danijar Hafner · Pieter Abbeel · Chelsea Finn
- 2021: Invited Talk by Danijar Hafner »
  Danijar Hafner
- 2021: Invited Talk by David Ha »
  David Ha
- 2021 Poster: Learning to Weight Imperfect Demonstrations »
  Yunke Wang · Chang Xu · Bo Du · Honglak Lee
- 2021 Poster: Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks »
  Sungryull Sohn · Sungtae Lee · Jongwook Choi · Harm van Seijen · Mehdi Fatemi · Honglak Lee
- 2021 Poster: Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning »
  Jongwook Choi · Archit Sharma · Honglak Lee · Sergey Levine · Shixiang Gu
- 2021 Spotlight: Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning »
  Jongwook Choi · Archit Sharma · Honglak Lee · Sergey Levine · Shixiang Gu
- 2021 Spotlight: Learning to Weight Imperfect Demonstrations »
  Yunke Wang · Chang Xu · Bo Du · Honglak Lee
- 2021 Spotlight: Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks »
  Sungryull Sohn · Sungtae Lee · Jongwook Choi · Harm van Seijen · Mehdi Fatemi · Honglak Lee
- 2021 Poster: State Entropy Maximization with Random Encoders for Efficient Exploration »
  Younggyo Seo · Lili Chen · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee
- 2021 Spotlight: State Entropy Maximization with Random Encoders for Efficient Exploration »
  Younggyo Seo · Lili Chen · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee
- 2020 Poster: Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning »
  Kimin Lee · Younggyo Seo · Seunghyun Lee · Honglak Lee · Jinwoo Shin
- 2020 Poster: Planning to Explore via Self-Supervised World Models »
  Ramanan Sekar · Oleh Rybkin · Kostas Daniilidis · Pieter Abbeel · Danijar Hafner · Deepak Pathak
- 2019 Poster: Meta-Learning Neural Bloom Filters »
  Jack Rae · Sergey Bartunov · Timothy Lillicrap
- 2019 Poster: Robust Inference via Generative Classifiers for Handling Noisy Labels »
  Kimin Lee · Sukmin Yun · Kibok Lee · Honglak Lee · Bo Li · Jinwoo Shin
- 2019 Poster: Similarity of Neural Network Representations Revisited »
  Simon Kornblith · Mohammad Norouzi · Honglak Lee · Geoffrey Hinton
- 2019 Oral: Similarity of Neural Network Representations Revisited »
  Simon Kornblith · Mohammad Norouzi · Honglak Lee · Geoffrey Hinton
- 2019 Oral: Robust Inference via Generative Classifiers for Handling Noisy Labels »
  Kimin Lee · Sukmin Yun · Kibok Lee · Honglak Lee · Bo Li · Jinwoo Shin
- 2019 Oral: Meta-Learning Neural Bloom Filters »
  Jack Rae · Sergey Bartunov · Timothy Lillicrap
- 2019 Poster: Deep Compressed Sensing »
  Yan Wu · Mihaela Rosca · Timothy Lillicrap
- 2019 Oral: Deep Compressed Sensing »
  Yan Wu · Mihaela Rosca · Timothy Lillicrap
- 2019 Poster: Composing Entropic Policies using Divergence Correction »
  Jonathan Hunt · Andre Barreto · Timothy Lillicrap · Nicolas Heess
- 2019 Poster: An Investigation of Model-Free Planning »
  Arthur Guez · Mehdi Mirza · Karol Gregor · Rishabh Kabra · Sebastien Racaniere · Theophane Weber · David Raposo · Adam Santoro · Laurent Orseau · Tom Eccles · Greg Wayne · David Silver · Timothy Lillicrap
- 2019 Oral: An Investigation of Model-Free Planning »
  Arthur Guez · Mehdi Mirza · Karol Gregor · Rishabh Kabra · Sebastien Racaniere · Theophane Weber · David Raposo · Adam Santoro · Laurent Orseau · Tom Eccles · Greg Wayne · David Silver · Timothy Lillicrap
- 2019 Oral: Composing Entropic Policies using Divergence Correction »
  Jonathan Hunt · Andre Barreto · Timothy Lillicrap · Nicolas Heess
- 2018 Poster: Self-Imitation Learning »
  Junhyuk Oh · Yijie Guo · Satinder Singh · Honglak Lee
- 2018 Oral: Self-Imitation Learning »
  Junhyuk Oh · Yijie Guo · Satinder Singh · Honglak Lee
- 2018 Poster: Measuring abstract reasoning in neural networks »
  Adam Santoro · Felix Hill · David GT Barrett · Ari S Morcos · Timothy Lillicrap
- 2018 Oral: Measuring abstract reasoning in neural networks »
  Adam Santoro · Felix Hill · David GT Barrett · Ari S Morcos · Timothy Lillicrap
- 2018 Poster: Fast Parametric Learning with Activation Memorization »
  Jack Rae · Chris Dyer · Peter Dayan · Timothy Lillicrap
- 2018 Poster: Hierarchical Long-term Video Prediction without Supervision »
  Nevan Wichers · Ruben Villegas · Dumitru Erhan · Honglak Lee
- 2018 Poster: Fixing a Broken ELBO »
  Alexander Alemi · Ben Poole · Ian Fischer · Joshua V Dillon · Rif Saurous · Kevin Murphy
- 2018 Oral: Fast Parametric Learning with Activation Memorization »
  Jack Rae · Chris Dyer · Peter Dayan · Timothy Lillicrap
- 2018 Oral: Hierarchical Long-term Video Prediction without Supervision »
  Nevan Wichers · Ruben Villegas · Dumitru Erhan · Honglak Lee
- 2018 Oral: Fixing a Broken ELBO »
  Alexander Alemi · Ben Poole · Ian Fischer · Joshua V Dillon · Rif Saurous · Kevin Murphy
- 2017 Poster: Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning »
  Junhyuk Oh · Satinder Singh · Honglak Lee · Pushmeet Kohli
- 2017 Talk: Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning »
  Junhyuk Oh · Satinder Singh · Honglak Lee · Pushmeet Kohli
- 2017 Poster: Learning to Generate Long-term Future via Hierarchical Prediction »
  Ruben Villegas · Jimei Yang · Yuliang Zou · Sungryull Sohn · Xunyu Lin · Honglak Lee
- 2017 Poster: Learning to Learn without Gradient Descent by Gradient Descent »
  Yutian Chen · Matthew Hoffman · Sergio Gómez Colmenarejo · Misha Denil · Timothy Lillicrap · Matthew Botvinick · Nando de Freitas
- 2017 Talk: Learning to Generate Long-term Future via Hierarchical Prediction »
  Ruben Villegas · Jimei Yang · Yuliang Zou · Sungryull Sohn · Xunyu Lin · Honglak Lee
- 2017 Poster: Robust Adversarial Reinforcement Learning »
  Lerrel Pinto · James Davidson · Rahul Sukthankar · Abhinav Gupta
- 2017 Talk: Learning to Learn without Gradient Descent by Gradient Descent »
  Yutian Chen · Matthew Hoffman · Sergio Gómez Colmenarejo · Misha Denil · Timothy Lillicrap · Matthew Botvinick · Nando de Freitas
- 2017 Talk: Robust Adversarial Reinforcement Learning »
  Lerrel Pinto · James Davidson · Rahul Sukthankar · Abhinav Gupta