Approximate policy iteration is a class of reinforcement learning (RL) algorithms in which the policy is encoded using a function approximator; it has been especially prominent in RL with continuous action spaces. In this class of algorithms, ensuring an increase of the policy return during the policy update often requires constraining the change in the action distribution. Several approximations exist in the literature for solving this constrained policy update problem. In this paper, we propose to improve over such solutions by introducing a set of projections that transform the constrained problem into an unconstrained one, which is then solved by standard gradient descent. Using these projections, we empirically demonstrate that our approach can improve both the policy update solution and the control over exploration of existing approximate policy iteration algorithms.
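To make the core idea concrete, here is a minimal, hypothetical PyTorch sketch for the simplest possible case: a one-dimensional Gaussian policy with fixed variance and a KL trust region. All names (`kl_gauss`, `project_mean`, `eps`) and the scalar rescaling projection are illustrative assumptions, not the paper's API or its exact projections; the point is only to show how optimizing *through* a projection turns a constrained policy update into unconstrained gradient descent.

```python
# Hedged sketch: KL projection for a 1-D Gaussian policy with fixed variance.
import torch

def kl_gauss(mu_new, mu_old, sigma):
    # KL(N(mu_new, sigma^2) || N(mu_old, sigma^2)) for a shared, fixed sigma
    return (mu_new - mu_old) ** 2 / (2 * sigma ** 2)

def project_mean(mu_free, mu_old, sigma, eps):
    # If the unconstrained mean would violate KL <= eps, rescale the update
    # direction so the projected mean lies exactly on the KL boundary.
    kl = kl_gauss(mu_free, mu_old, sigma)
    scale = torch.sqrt(eps / kl.clamp_min(1e-12)).clamp_max(1.0)
    return mu_old + scale * (mu_free - mu_old)

mu_old = torch.tensor(0.0)   # mean of the previous policy
sigma = torch.tensor(1.0)    # fixed standard deviation
eps = torch.tensor(0.05)     # KL trust-region bound

# Unconstrained parameter, optimized by plain gradient descent; the
# projection keeps the *effective* policy inside the KL region.
mu_free = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.SGD([mu_free], lr=0.1)

for _ in range(100):
    mu = project_mean(mu_free, mu_old, sigma, eps)
    loss = -3.0 * mu  # stand-in for the (negated) policy improvement objective
    opt.zero_grad()
    loss.backward()
    opt.step()

mu = project_mean(mu_free, mu_old, sigma, eps)
print(float(kl_gauss(mu, mu_old, sigma)))  # stays <= eps = 0.05
```

The paper applies this principle to full policy distributions rather than a single scalar mean, and its projections also give the control over exploration mentioned in the abstract; the scalar case above is only the simplest instance of replacing a constrained update with unconstrained descent through a projection.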
Author Information
Riad Akrour (TU Darmstadt)
Joni Pajarinen (TU Darmstadt)
Jan Peters (TU Darmstadt + Max Planck Institute for Intelligent Systems)
Gerhard Neumann (University of Lincoln)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Oral: Projections for Approximate Policy Iteration Algorithms »
  Thu. Jun 13th, 04:30 -- 04:35 PM, Room: Hall B
More from the Same Authors
- 2021: Exploration via Empowerment Gain: Combining Novelty, Surprise and Learning Progress »
  Philip Becker-Ehmck · Maximilian Karl · Jan Peters · Patrick van der Smagt
- 2023: Parameterized projected Bellman operator »
  Théo Vincent · Alberto Maria Metelli · Jan Peters · Marcello Restelli · Carlo D'Eramo
- 2022 Poster: Curriculum Reinforcement Learning via Constrained Optimal Transport »
  Pascal Klink · Haoyi Yang · Carlo D'Eramo · Jan Peters · Joni Pajarinen
- 2022 Spotlight: Curriculum Reinforcement Learning via Constrained Optimal Transport »
  Pascal Klink · Haoyi Yang · Carlo D'Eramo · Jan Peters · Joni Pajarinen
- 2021: RL + Robotics Panel »
  George Konidaris · Jan Peters · Martin Riedmiller · Angela Schoellig · Rose Yu · Rupam Mahmood
- 2021 Poster: Value Iteration in Continuous Actions, States and Time »
  Michael Lutter · Shie Mannor · Jan Peters · Dieter Fox · Animesh Garg
- 2021 Spotlight: Value Iteration in Continuous Actions, States and Time »
  Michael Lutter · Shie Mannor · Jan Peters · Dieter Fox · Animesh Garg
- 2021 Poster: Convex Regularization in Monte-Carlo Tree Search »
  Tuan Q Dam · Carlo D'Eramo · Jan Peters · Joni Pajarinen
- 2021 Spotlight: Convex Regularization in Monte-Carlo Tree Search »
  Tuan Q Dam · Carlo D'Eramo · Jan Peters · Joni Pajarinen
- 2018 Poster: PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos »
  Paavo Parmas · Carl E Rasmussen · Jan Peters · Kenji Doya
- 2018 Oral: PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos »
  Paavo Parmas · Carl E Rasmussen · Jan Peters · Kenji Doya
- 2018 Poster: Efficient Gradient-Free Variational Inference using Policy Search »
  Oleg Arenz · Gerhard Neumann · Mingjun Zhong
- 2018 Oral: Efficient Gradient-Free Variational Inference using Policy Search »
  Oleg Arenz · Gerhard Neumann · Mingjun Zhong
- 2017 Poster: Local Bayesian Optimization of Motor Skills »
  Riadh Akrour · Dmitry Sorokin · Jan Peters · Gerhard Neumann
- 2017 Talk: Local Bayesian Optimization of Motor Skills »
  Riadh Akrour · Dmitry Sorokin · Jan Peters · Gerhard Neumann