ICML Comparison and Unification of Three Regularization Methods in Batch Reinforcement Learning

Poster in Workshop: Workshop on Reinforcement Learning Theory

Sarah Rathnam

Abstract:

In batch reinforcement learning, poorly explored state-action pairs can lead to poorly learned, inaccurate models and, in turn, poorly performing policies. Various regularization methods can mitigate the problem of learning overly complex models in Markov decision processes (MDPs); however, they operate in technically and intuitively distinct ways and lack a common form in which they can be compared. This paper unifies three regularization methods in a common framework: a weighted average transition matrix. Considering regularization methods in this common form illuminates how the MDP structure and the state-action pair distribution of the batch data set influence the relative performance of regularization methods. We confirm intuitions generated from the common framework by empirical evaluation across a range of MDPs and data collection policies.
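
The abstract does not specify the three methods or how their weights are chosen, so the following is only a minimal sketch of what a weighted average transition matrix could look like in general. It blends a count-based estimate with a fixed prior matrix. The function names, the uniform prior, and the single scalar `weight` are all assumptions made for illustration, not the paper's formulation.

```python
import numpy as np


def empirical_transitions(counts):
    """Row-normalise visit counts counts[s, a, s'] into transition probabilities.

    State-action pairs that never appear in the batch get a uniform row
    (an assumption made here so that every row is a valid distribution).
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=-1, keepdims=True)
    n_states = counts.shape[-1]
    uniform = np.full_like(counts, 1.0 / n_states)
    safe_totals = np.where(totals > 0, totals, 1.0)  # avoid division by zero
    return np.where(totals > 0, counts / safe_totals, uniform)


def weighted_average_transitions(counts, prior, weight):
    """Blend the empirical estimate with a prior transition matrix.

    weight = 0 returns the empirical estimate; weight = 1 returns the prior.
    Both the prior and the scalar weight are hypothetical choices.
    """
    p_hat = empirical_transitions(counts)
    return (1.0 - weight) * p_hat + weight * np.asarray(prior, dtype=float)


# Toy batch: 2 states, 1 action; state 1 was never visited.
counts = np.array([[[3, 1]], [[0, 0]]])
prior = np.full((2, 1, 2), 0.5)  # uniform prior (assumed)
blended = weighted_average_transitions(counts, prior, weight=0.2)
```

Because each row of both inputs sums to one, every row of `blended` is also a valid probability distribution; a larger `weight` pulls poorly explored rows further toward the prior.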
