Poster in Workshop: Workshop on Reinforcement Learning Theory
Finite time analysis of temporal difference learning with linear function approximation: the tail averaged case
Gandharv Patil · Prashanth L.A. · Doina Precup
Abstract:
In this paper, we study the finite-time behaviour of temporal difference (TD) learning algorithms combined with tail averaging, and present instance-dependent bounds on the parameter error of the tail-averaged TD iterate. Our error bounds hold both in expectation and with high probability, exhibit a sharper rate of decay for the initial error (bias), and are comparable to existing bounds in the literature.
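To make the setting concrete, here is a minimal sketch (not the paper's exact algorithm or analysis) of TD(0) with linear function approximation and tail averaging on a small synthetic Markov reward process; all problem parameters (features, transition matrix, rewards, step size, number of iterations) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 3-state Markov reward process with 2-dimensional features.
n_states, d = 3, 2
phi = rng.standard_normal((n_states, d))   # feature matrix (rows: states)
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.3, 0.5]])            # transition probabilities
r = np.array([1.0, 0.0, -1.0])             # per-state rewards
gamma = 0.9                                # discount factor
alpha = 0.02                               # constant step size (assumed)

T = 5000
theta = np.zeros(d)
iterates = np.empty((T, d))
s = 0
for t in range(T):
    s_next = rng.choice(n_states, p=P[s])
    # TD(0) update: move theta along the TD error times the feature vector.
    delta = r[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
    theta = theta + alpha * delta * phi[s]
    iterates[t] = theta
    s = s_next

# Tail averaging: average only the last half of the iterates,
# discarding the early transient where the initial error (bias) dominates.
theta_tail = iterates[T // 2:].mean(axis=0)
```

The tail average keeps the variance-reduction benefit of iterate averaging while dropping the early iterates, which is what yields the sharper decay of the initial error discussed in the abstract.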