Poster in Workshop: Workshop on Reinforcement Learning Theory
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette · Martin Wainwright · Emma Brunskill
Abstract:
Actor-critic methods are widely used in offline reinforcement learning practice but are understudied theoretically. In this work we show that the pessimism principle can be naturally incorporated into actor-critic formulations. We design an offline actor-critic algorithm for a linear MDP model that is more general than the low-rank model. The procedure is both minimax optimal and computationally tractable.
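To make the idea of a "pessimistic" actor-critic concrete, below is a minimal, hypothetical sketch (not the authors' algorithm) of how pessimism is often instantiated for a linear value model: the critic fits action values by regularized least squares and subtracts an elliptical uncertainty bonus, and the actor takes a soft policy-improvement step against those penalized values. All names (`phi_sa`, `beta`, `softmax_actor_step`, etc.) are assumptions introduced only for illustration.

```python
import numpy as np

# Hypothetical illustration of a pessimistic critic for a linear value
# model, followed by a soft actor update. This is NOT the algorithm from
# the paper; it only sketches the general pessimism-penalty idea.

def pessimistic_critic(phi_sa, rewards, phi_next_v, beta=1.0, reg=1.0):
    """Least-squares critic with an uncertainty penalty (pessimism).

    phi_sa:     (n, d) features of observed state-action pairs
    rewards:    (n,) observed rewards
    phi_next_v: (n,) bootstrapped value estimates at the next states
    """
    n, d = phi_sa.shape
    # Regularized empirical covariance of the dataset features.
    cov = phi_sa.T @ phi_sa + reg * np.eye(d)
    cov_inv = np.linalg.inv(cov)
    # Regression against one-step temporal-difference targets.
    targets = rewards + phi_next_v
    w = cov_inv @ phi_sa.T @ targets

    def q_pessimistic(phi):
        # Point estimate minus an elliptical uncertainty bonus:
        # directions poorly covered by the dataset are penalized,
        # which is the pessimism principle for offline RL.
        bonus = beta * np.sqrt(phi @ cov_inv @ phi)
        return phi @ w - bonus

    return q_pessimistic


def softmax_actor_step(logits, q_values, lr=0.1):
    """One soft policy-improvement (mirror-descent style) step of the
    actor against the pessimistic critic's action values."""
    return logits + lr * q_values
```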