

Oral

Composing Value Functions in Reinforcement Learning

Benjamin van Niekerk · Steven James · Adam Earle · Benjamin Rosman

Abstract:

An important property for lifelong-learning agents is the ability to combine existing skills to solve new unseen tasks. In general, however, it is unclear how to compose existing skills in a principled manner. We show that optimal value function composition can be achieved in entropy-regularised reinforcement learning (RL), and then extend this result to the standard RL setting. Composition is demonstrated in a high-dimensional video game environment, where an agent with an existing library of skills is immediately able to solve new tasks without the need for further learning.
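As a rough illustration of what composing value functions could look like in a small tabular setting, the sketch below combines a library of Q-tables in two ways: a weighted log-sum-exp, which is a natural candidate in the entropy-regularised case, and an elementwise maximum, which is what the log-sum-exp approaches as the temperature goes to zero. The abstract does not state the composition rule, so the formulas, the function names `compose_soft` and `compose_max`, the `temperature` parameter, and the uniform default weights are all assumptions made for illustration and should not be read as the authors' exact method.

```python
import numpy as np


def compose_soft(q_values, weights=None, temperature=1.0):
    """Weighted log-sum-exp over K Q-tables stacked as shape (K, S, A).

    Assumed composition rule for the entropy-regularised setting;
    not taken from the abstract.
    """
    q = np.asarray(q_values, dtype=float)
    k = q.shape[0]
    # Hypothetical default: weight every skill equally.
    if weights is None:
        w = np.full(k, 1.0 / k)
    else:
        w = np.asarray(weights, dtype=float)
    scaled = q / temperature
    # Subtract the per-entry maximum before exponentiating for stability.
    m = scaled.max(axis=0)
    summed = np.tensordot(w, np.exp(scaled - m), axes=1)
    return temperature * (m + np.log(summed))


def compose_max(q_values):
    """Elementwise maximum over K Q-tables: the zero-temperature limit
    of compose_soft, assumed here for the standard RL setting."""
    return np.asarray(q_values, dtype=float).max(axis=0)


def greedy_policy(q_table):
    """Pick the highest-valued action in every state."""
    return np.argmax(q_table, axis=1)
```

With two skills stored as Q-tables of shape (states, actions), `greedy_policy(compose_max([q_a, q_b]))` would yield a policy for the combined task without any further learning, under the assumptions stated above.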
