Composing Value Functions in Reinforcement Learning

Benjamin van Niekerk · Steven James · Adam Earle · Benjamin Rosman

Pacific Ballroom #251

Keywords: [ Transfer and Multitask Learning ] [ Theory and Algorithms ] [ Deep Reinforcement Learning ]


An important property for lifelong-learning agents is the ability to combine existing skills to solve new unseen tasks. In general, however, it is unclear how to compose existing skills in a principled manner. Under the assumption of deterministic dynamics, we prove that optimal value function composition can be achieved in entropy-regularised reinforcement learning (RL), and extend this result to the standard RL setting. Composition is demonstrated in a high-dimensional video game, where an agent with an existing library of skills is immediately able to solve new tasks without the need for further learning.
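The abstract does not state the composition rule itself, so the following is only an illustrative sketch under stated assumptions. It assumes that, for a task solved by completing any one of several learned tasks, the entropy-regularised composition takes the form of a temperature-scaled, weighted log-sum-exp over the constituent Q-values, and that the standard RL counterpart is an elementwise maximum. The function names, the uniform weights, and the toy Q-tables are hypothetical and are not taken from the paper.

```python
import math


def soft_compose(q_values, temperature=1.0, weights=None):
    """Weighted log-sum-exp of per-task Q-values for one (state, action) pair.

    Assumed form of composition in the entropy-regularised setting.
    """
    if weights is None:
        # Assumption: uniform weights over the constituent tasks.
        weights = [1.0 / len(q_values)] * len(q_values)
    scaled = [q / temperature for q in q_values]
    m = max(scaled)  # subtract the maximum for numerical stability
    total = sum(w * math.exp(s - m) for w, s in zip(weights, scaled))
    return temperature * (m + math.log(total))


def hard_compose(q_values):
    """Elementwise maximum: the assumed low-temperature (standard RL) limit."""
    return max(q_values)


def greedy_action(q_tables, state, compose):
    """Pick the action maximising the composed Q-value, with no further learning."""
    n_actions = len(q_tables[0][state])
    composed = [
        compose([q[state][a] for q in q_tables]) for a in range(n_actions)
    ]
    return max(range(n_actions), key=composed.__getitem__)


# Hypothetical Q-tables for two previously learned skills (one state, two actions).
q_task_a = {"s0": [1.0, 0.0]}
q_task_b = {"s0": [0.0, 2.0]}

action = greedy_action([q_task_a, q_task_b], "s0", hard_compose)
```

As the temperature shrinks, `soft_compose` approaches `hard_compose`, which is one way to read the abstract's step from the entropy-regularised result to the standard RL setting.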
