Oral
Per-Decision Option Discounting
Anna Harutyunyan · Peter Vrancx · Philippe Hamel · Ann Nowe · Doina Precup
In order to solve complex problems, an agent must be able to reason over a sufficiently long horizon. Temporal abstraction, commonly modeled through options, offers the ability to reason at many time scales, but the horizon length is still determined by the single discount factor of the underlying Markov Decision Process. We propose a modification to the options framework that allows the agent’s horizon to grow naturally as its actions become more complex and extended in time. We show that the proposed option-step discount controls a bias-variance trade-off, with larger discounts (counter-intuitively) leading to less estimation variance.
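One possible reading of the abstract can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names, the `gamma_option` parameter, and the exact form of the per-decision target are assumptions introduced here. The first function computes the usual options (SMDP) bootstrap target, where the value at option termination is discounted by `gamma ** k` for an option lasting `k` steps. The second replaces that duration-dependent factor with a single discount applied once per option decision, so longer options no longer shrink the weight of what follows.

```python
from typing import Sequence


def discounted_reward_sum(rewards: Sequence[float], gamma: float) -> float:
    """Sum of rewards collected inside one option, discounted per time step."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def smdp_target(rewards: Sequence[float], next_value: float, gamma: float) -> float:
    """Standard options target: the bootstrap value is discounted by
    gamma ** k, where k is the number of primitive steps the option ran."""
    k = len(rewards)
    return discounted_reward_sum(rewards, gamma) + (gamma ** k) * next_value


def per_decision_target(
    rewards: Sequence[float],
    next_value: float,
    gamma: float,
    gamma_option: float,  # hypothetical name for the option-step discount
) -> float:
    """Assumed per-decision variant: the bootstrap value is discounted once
    per option decision by gamma_option, regardless of option duration."""
    return discounted_reward_sum(rewards, gamma) + gamma_option * next_value


if __name__ == "__main__":
    short_option = [1.0] * 3
    long_option = [1.0] * 30
    for rewards in (short_option, long_option):
        print(
            len(rewards),
            smdp_target(rewards, next_value=10.0, gamma=0.9),
            per_decision_target(rewards, next_value=10.0, gamma=0.9, gamma_option=0.9),
        )
```

Under these assumptions, the weight placed on the bootstrap value in the standard target decays with option length, while in the per-decision variant it stays fixed at `gamma_option`, which is one way the effective horizon could grow as options become longer.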
Author Information
Anna Harutyunyan (DeepMind)
Peter Vrancx (PROWLER.io)
Philippe Hamel (DeepMind)
Ann Nowe (VU Brussel)
Doina Precup (DeepMind)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Per-Decision Option Discounting
Wed. Jun 12th 01:30 -- 04:00 AM Room Pacific Ballroom #114
More from the Same Authors
- 2021: Gradient Starvation: A Learning Proclivity in Neural Networks
Mohammad Pezeshki · Sékou-Oumar Kaba · Yoshua Bengio · Aaron Courville · Doina Precup · Guillaume Lajoie
- 2023 Poster: Bootstrapped Representations in Reinforcement Learning
Charline Le Lan · Stephen Tu · Mark Rowland · Anna Harutyunyan · Rishabh Agarwal · Marc Bellemare · Will Dabney
- 2023 Poster: DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm
Yunhao Tang · Tadashi Kozuno · Mark Rowland · Anna Harutyunyan · Remi Munos · Bernardo Avila Pires · Michal Valko
- 2022 Poster: Proving Theorems using Incremental Learning and Hindsight Experience Replay
Eser Aygün · Ankit Anand · Laurent Orseau · Xavier Glorot · Stephen McAleer · Vlad Firoiu · Lei Zhang · Doina Precup · Shibl Mourad
- 2022 Spotlight: Proving Theorems using Incremental Learning and Hindsight Experience Replay
Eser Aygün · Ankit Anand · Laurent Orseau · Xavier Glorot · Stephen McAleer · Vlad Firoiu · Lei Zhang · Doina Precup · Shibl Mourad
- 2021 Poster: Counterfactual Credit Assignment in Model-Free Reinforcement Learning
Thomas Mesnard · Theophane Weber · Fabio Viola · Shantanu Thakoor · Alaa Saade · Anna Harutyunyan · Will Dabney · Thomas Stepleton · Nicolas Heess · Arthur Guez · Eric Moulines · Marcus Hutter · Lars Buesing · Remi Munos
- 2021 Spotlight: Counterfactual Credit Assignment in Model-Free Reinforcement Learning
Thomas Mesnard · Theophane Weber · Fabio Viola · Shantanu Thakoor · Alaa Saade · Anna Harutyunyan · Will Dabney · Thomas Stepleton · Nicolas Heess · Arthur Guez · Eric Moulines · Marcus Hutter · Lars Buesing · Remi Munos
- 2020 Poster: What can I do here? A Theory of Affordances in Reinforcement Learning
Khimya Khetarpal · Zafarali Ahmed · Gheorghe Comanici · David Abel · Doina Precup
- 2020 Poster: Batch Reinforcement Learning with Hyperparameter Gradients
Byung-Jun Lee · Jongmin Lee · Peter Vrancx · Dongho Kim · Kee-Eung Kim