Talk
in
Workshop: Theoretical Foundations of Reinforcement Learning
An Off-policy Policy Gradient Theorem: A Tale About Weightings - Martha White
Martha White
Abstract:
The goal of the talk is to discuss our recent work on an off-policy policy gradient theorem, and how it can help leverage theory for the on-policy setting for use in the off-policy setting. The key insight is to provide a more general objective for the off-policy setting, that encompasses the on-policy episodic objective. These simple generalizations make it straightforward to port and generalize ideas.
Chat is not available.