In recent years, a growing number of deep model-based reinforcement learning (RL) methods have been introduced. The interest in deep model-based RL is not surprising, given its many potential benefits, such as higher sample efficiency and the potential for fast adaptation to changes in the environment. However, we demonstrate, using an improved version of the recently introduced Local Change Adaptation (LoCA) setup, that well-known model-based methods such as PlaNet and DreamerV2 perform poorly in their ability to adapt to local environmental changes. Combined with prior work that made a similar observation about another popular model-based method, MuZero, a trend emerges, suggesting that current deep model-based methods have serious limitations. We dive deeper into the causes of this poor performance by identifying elements that hurt adaptive behavior and linking these to underlying techniques frequently used in deep model-based RL. We empirically validate these insights in the case of linear function approximation by demonstrating that a modified version of linear Dyna achieves effective adaptation to local changes. Furthermore, we provide detailed insights into the challenges of building an adaptive nonlinear model-based method, by experimenting with a nonlinear version of Dyna.
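The adaptivity question the abstract raises can be illustrated with a minimal sketch: after a local reward change, planning with a learned model should propagate the change to the values of distant states. The sketch below uses plain tabular Dyna-Q on a toy three-state chain, not the paper's modified linear Dyna or the LoCA setup itself; the environment and all names are illustrative assumptions.

```python
import numpy as np

# Minimal tabular Dyna-Q sketch (illustrative only): a 3-state chain
# 0 -> 1 -> 2, where state 2 is terminal and carries the only reward.
# Flipping the goal reward is the "local change"; planning steps should
# propagate the flip back to the start state's value.

rng = np.random.default_rng(0)
n_states, n_actions = 3, 1
gamma = 0.9

def step(s, reward_at_goal):
    s2 = min(s + 1, 2)
    r = reward_at_goal if s2 == 2 else 0.0
    return s2, r

Q = np.zeros((n_states, n_actions))
model = {}  # (s, a) -> (s', r): the learned (here, memorized) model

def train(episodes, reward_at_goal, planning_steps=20, alpha=0.5):
    for _ in range(episodes):
        s = 0
        while s != 2:
            a = 0
            s2, r = step(s, reward_at_goal)
            # direct RL update from real experience
            Q[s, a] += alpha * (r + gamma * Q[s2].max() * (s2 != 2) - Q[s, a])
            model[(s, a)] = (s2, r)  # record the observed transition
            s = s2
            # planning: replay model transitions to spread the value change
            for _ in range(planning_steps):
                ps, pa = list(model)[rng.integers(len(model))]
                ps2, pr = model[(ps, pa)]
                Q[ps, pa] += alpha * (pr + gamma * Q[ps2].max() * (ps2 != 2) - Q[ps, pa])

train(20, reward_at_goal=1.0)    # phase 1: goal reward is +1
v_before = Q[0, 0]
train(20, reward_at_goal=-1.0)   # phase 2: local change, goal reward is -1
v_after = Q[0, 0]
```

In the tabular case the start-state value flips sign after the change; the paper's point is that the analogous propagation often fails for deep model-based methods, and that even linear Dyna needs modification to pass this kind of test.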
Author Information
Yi Wan (University of Alberta)
Ali Rahimi-Kalahroudi (MILA - Université de Montréal)
Janarthanan Rajendran (Mila, University of Montreal)
Ida Momennejad (Microsoft Research)
Sarath Chandar (Mila / École Polytechnique de Montréal)
Harm van Seijen (Microsoft Research)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Spotlight: Towards Evaluating Adaptivity of Model-Based Reinforcement Learning Methods »
Wed. Jul 20th 09:35 -- 09:40 PM Room Hall G
More from the Same Authors
-
2022 : Interaction-Grounded Learning with Action-inclusive Feedback »
Tengyang Xie · Akanksha Saran · Dylan Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford -
2022 Social: Designing an RL system toward AGI »
Yi Wan · Alex Ayoub -
2021 Workshop: Workshop on Computational Approaches to Mental Health @ ICML 2021 »
Niranjani Prasad · Caroline Weis · Shems Saleh · Rosanne Liu · Jake Vasilakes · Agni Kumar · Tianlin Zhang · Ida Momennejad · Danielle Belgrave -
2021 Poster: Average-Reward Off-Policy Policy Evaluation with Function Approximation »
Shangtong Zhang · Yi Wan · Richard Sutton · Shimon Whiteson -
2021 Spotlight: Average-Reward Off-Policy Policy Evaluation with Function Approximation »
Shangtong Zhang · Yi Wan · Richard Sutton · Shimon Whiteson -
2021 Poster: Interaction-Grounded Learning »
Tengyang Xie · John Langford · Paul Mineiro · Ida Momennejad -
2021 Spotlight: Interaction-Grounded Learning »
Tengyang Xie · John Langford · Paul Mineiro · Ida Momennejad -
2021 Poster: Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks »
Sungryull Sohn · Sungtae Lee · Jongwook Choi · Harm van Seijen · Mehdi Fatemi · Honglak Lee -
2021 Poster: Learning and Planning in Average-Reward Markov Decision Processes »
Yi Wan · Abhishek Naik · Richard Sutton -
2021 Spotlight: Learning and Planning in Average-Reward Markov Decision Processes »
Yi Wan · Abhishek Naik · Richard Sutton -
2021 Spotlight: Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks »
Sungryull Sohn · Sungtae Lee · Jongwook Choi · Harm van Seijen · Mehdi Fatemi · Honglak Lee -
2021 Social: Continuing (Non-episodic) RL Problems »
Yi Wan -
2021 Poster: Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation »
Sam Devlin · Raluca Georgescu · Ida Momennejad · Jaroslaw Rzepecki · Evelyn Zuniga · Gavin Costello · Guy Leroy · Ali Shaw · Katja Hofmann -
2021 Spotlight: Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation »
Sam Devlin · Raluca Georgescu · Ida Momennejad · Jaroslaw Rzepecki · Evelyn Zuniga · Gavin Costello · Guy Leroy · Ali Shaw · Katja Hofmann -
2020 : Concluding Remarks »
Sarath Chandar · Shagun Sodhani -
2020 : Panel Discussion »
Eric Eaton · Martha White · Doina Precup · Irina Rish · Harm van Seijen -
2020 : Q&A by Rich Sutton »
Richard Sutton · Shagun Sodhani · Sarath Chandar -
2020 : Q&A with Irina Rish »
Irina Rish · Shagun Sodhani · Sarath Chandar -
2020 : Q&A with Jürgen Schmidhuber »
Jürgen Schmidhuber · Shagun Sodhani · Sarath Chandar -
2020 : Q&A with Partha Pratim Talukdar »
Partha Talukdar · Shagun Sodhani · Sarath Chandar -
2020 : Q&A with Katja Hofmann »
Katja Hofmann · Luisa Zintgraf · Rika Antonova · Sarath Chandar · Shagun Sodhani -
2020 Workshop: 4th Lifelong Learning Workshop »
Shagun Sodhani · Sarath Chandar · Balaraman Ravindran · Doina Precup -
2020 : Opening Comments »
Sarath Chandar · Shagun Sodhani -
2019 Poster: Dead-ends and Secure Exploration in Reinforcement Learning »
Mehdi Fatemi · Shikhar Sharma · Harm van Seijen · Samira Ebrahimi Kahou -
2019 Oral: Dead-ends and Secure Exploration in Reinforcement Learning »
Mehdi Fatemi · Shikhar Sharma · Harm van Seijen · Samira Ebrahimi Kahou -
2017 : Achieving Above-Human Performance on Ms. Pac-Man by Reward Decomposition »
Harm van Seijen