
Per-Decision Option Discounting
Anna Harutyunyan · Peter Vrancx · Philippe Hamel · Ann Nowe · Doina Precup

Tue Jun 11 06:30 PM -- 09:00 PM (PDT) @ Pacific Ballroom #114

To solve complex problems, an agent must be able to reason over a sufficiently long horizon. Temporal abstraction, commonly modeled through options, offers the ability to reason at many timescales, but the horizon length is still determined by the discount factor of the underlying Markov Decision Process. We propose a modification to the options framework that naturally scales the agent's horizon with option length. We show that the proposed option-step discount controls a bias-variance trade-off, with larger discounts (counter-intuitively) leading to less estimation variance.
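To make the horizon argument concrete, here is a minimal sketch of how the effective horizon might change when discounting is applied once per option decision rather than once per primitive step. It relies only on the standard approximation that a geometric discount gamma gives a horizon of about 1 / (1 - gamma) steps. The function names, the fixed option length, and the "discount once per option" rule are illustrative assumptions, not the formulation used by the authors.

```python
def effective_horizon(discount: float) -> float:
    """Approximate horizon 1 / (1 - discount) of a geometric discount."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 / (1.0 - discount)


def horizon_in_primitive_steps(gamma: float, option_length: int,
                               per_option: bool) -> float:
    """Horizon measured in primitive steps (hypothetical helper).

    per_option=False: gamma is applied at every primitive step, so the
    horizon is fixed by the MDP discount regardless of option length.
    per_option=True: ASSUMPTION for illustration only. gamma is applied
    once per option decision, so the horizon grows with option length.
    """
    if option_length < 1:
        raise ValueError("option_length must be at least 1")
    decisions = effective_horizon(gamma)
    return decisions * option_length if per_option else decisions


if __name__ == "__main__":
    for k in (1, 5, 20):
        fixed = horizon_in_primitive_steps(0.9, k, per_option=False)
        scaled = horizon_in_primitive_steps(0.9, k, per_option=True)
        print(f"option length {k:2d}: per-step {fixed:.1f}, per-option {scaled:.1f}")
```

Under these assumptions, the per-step horizon stays the same for every option length, while the per-option horizon grows in proportion to it, which is the scaling behaviour the abstract describes in qualitative terms.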

Author Information

Anna Harutyunyan (DeepMind)
Peter Vrancx (PROWLER.io)
Philippe Hamel (DeepMind)
Ann Nowe (VU Brussel)
Doina Precup (DeepMind)
