Projections for Approximate Policy Iteration Algorithms
Riad Akrour · Joni Pajarinen · Jan Peters · Gerhard Neumann

Thu Jun 13 09:30 AM -- 09:35 AM (PDT) @ Hall B

Approximate policy iteration is a class of reinforcement learning algorithms in which both the value function and the policy are encoded using function approximators, and it has been especially prominent in continuous action spaces. However, when the policy is encoded with a function approximator, it often becomes necessary to constrain the change in action distribution during the policy update to ensure that the policy return increases. Several approximations exist in the literature to solve this constrained policy update problem. In this paper, we propose to improve over such solutions by introducing a set of projections that transform the constrained problem into an unconstrained one, which is then solved by standard gradient descent. Using these projections, we empirically demonstrate that our approach can improve both the policy update solution and the control over exploration of existing approximate policy iteration algorithms.
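To make the idea of turning a constrained policy update into an unconstrained one concrete, here is a minimal sketch. It is not the projections proposed in the paper. It assumes a discrete softmax policy, a KL bound `eps` around the previous policy `q`, a hypothetical mixing projection `project` that pulls an infeasible distribution back toward `q`, and finite-difference gradients in place of automatic differentiation. All function names, parameters, and numbers are illustrative assumptions.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl(p, q):
    # KL(p || q) for discrete distributions; q is assumed strictly positive.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def project(p, q, eps, iters=50):
    # Hypothetical projection (an assumption of this sketch): if p violates
    # KL(p || q) <= eps, mix it with q. KL(mix || q) grows with the weight
    # on p, so bisection finds the largest feasible weight.
    if kl(p, q) <= eps:
        return p
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        mix = [mid * pi + (1.0 - mid) * qi for pi, qi in zip(p, q)]
        if kl(mix, q) <= eps:
            lo = mid
        else:
            hi = mid
    return [lo * pi + (1.0 - lo) * qi for pi, qi in zip(p, q)]

def objective(logits, q, adv, eps):
    # Expected advantage of the *projected* policy. Every logit vector maps
    # to a feasible policy, so the logits themselves are unconstrained.
    pi = project(softmax(logits), q, eps)
    return sum(p * a for p, a in zip(pi, adv))

def grad_fd(f, x, h=1e-5):
    # Central finite differences, used here only to keep the sketch
    # dependency-free.
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad

def policy_update(q, adv, eps, steps=200, lr=0.5):
    # Plain gradient ascent on the return (equivalently, gradient descent
    # on its negative), starting from the previous policy q.
    logits = [math.log(qi) for qi in q]
    f = lambda th: objective(th, q, adv, eps)
    for _ in range(steps):
        g = grad_fd(f, logits)
        logits = [t + lr * gi for t, gi in zip(logits, g)]
    return project(softmax(logits), q, eps)

if __name__ == "__main__":
    q = [0.25, 0.25, 0.25, 0.25]   # previous policy (illustrative)
    adv = [1.0, 0.0, -1.0, 0.5]    # advantage estimates (illustrative)
    new_pi = policy_update(q, adv, eps=0.05)
    print("new policy:", new_pi)
    print("KL to previous policy:", kl(new_pi, q))
```

Because the projection is applied inside the objective, the optimizer never needs to handle the KL bound explicitly. The returned policy satisfies the bound by construction.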

Author Information

Riad Akrour (TU Darmstadt)
Joni Pajarinen (TU Darmstadt)
Jan Peters (TU Darmstadt + Max Planck Institute for Intelligent Systems)
Gerhard Neumann (University of Lincoln)
