ICML CPeSFA: Empowering SFs for Policy Learning and Transfer in Continuous Action Spaces

Poster in Workshop: Foundations of Reinforcement Learning and Control: Connections and Perspectives

Yining Li · Tianpei Yang · Wei Guo · Jianye Hao · Yan Zheng

Abstract:

Successor Features (SFs), combined with Generalized Policy Improvement (GPI), form a standard transfer Reinforcement Learning (RL) approach that transfers knowledge by decoupling the policy from the task. However, SFs are value-based and cannot handle environments with continuous action spaces, because GPI transfers knowledge by traversing all possible actions, which is infeasible in that setting. The recent PeSFA method decouples SFs from policies and further endows SFs with the ability to generalize across the policy space, but it likewise cannot be applied to continuous action spaces. In this paper, we introduce Continuous PeSFA (CPeSFA), an Actor-Critic (AC) algorithm designed for learning and transferring policies in continuous action spaces. Our theoretical analysis shows that CPeSFA leverages the generalization of SFs in the policy space to accelerate learning. Experimental results on the Grid World, Reacher, and Point Maze environments demonstrate CPeSFA's superior performance and its effective knowledge transfer for rapid policy learning in new tasks.
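
To illustrate why GPI depends on enumerating actions, the following is a minimal sketch of standard SF + GPI action selection for a discrete action set. It assumes the usual linear reward model, in which the value of source policy i on a new task is Q_i(s, a) = psi_i(s, a) · w. This is not CPeSFA and not the authors' code; the function name `gpi_action` and the toy numbers are hypothetical.

```python
import numpy as np


def gpi_action(psi, w):
    """Pick an action by Generalized Policy Improvement at a single state.

    psi: array of shape (n_policies, n_actions, d), the successor features
         psi_i(s, a) of each source policy i at the current state s.
    w:   array of shape (d,), reward weights of the new task (assumed known).
    """
    # Linear reward assumption: Q_i(s, a) = psi_i(s, a) . w
    q = psi @ w                      # shape (n_policies, n_actions)
    # GPI: for every action take the best source policy, then the best action.
    best_per_action = q.max(axis=0)  # shape (n_actions,)
    return int(np.argmax(best_per_action)), q


# Toy example (hypothetical numbers): 2 source policies, 3 actions, d = 2.
psi = np.array([
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],  # source policy 0
    [[0.2, 0.8], [0.9, 0.1], [0.0, 2.0]],  # source policy 1
])
w_new = np.array([1.0, -1.0])

action, q = gpi_action(psi, w_new)
```

The `max` and `argmax` run over a finite action axis. With a continuous action space there is no such axis to enumerate, which is the limitation the abstract describes.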
