Poster
Kernel-Based Reinforcement Learning in Robust Markov Decision Processes
Shiau Hong Lim · Arnaud Autef
Pacific Ballroom #120
Keywords: [ Robust Statistics and Machine Learning ] [ Safety ] [ Theory and Algorithms ]
The robust Markov decision process (MDP) framework aims to address the problem of parameter uncertainty arising from model mismatch, approximation errors, or even adversarial behavior. It is especially relevant when learned policies are deployed in real-world applications. Scaling the robust MDP framework to large or continuous state spaces remains a challenging problem: function approximation is usually unavoidable in this setting, which only amplifies model mismatch and parameter uncertainty. It has previously been shown that, for MDPs with state aggregation, robust policies enjoy a tighter performance bound than standard solutions due to their reduced sensitivity to approximation errors. We extend these results to the much larger class of kernel-based approximators and show, both analytically and empirically, that robust policies can significantly outperform their non-robust counterparts.
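To make the setting concrete, below is a minimal sketch of a robust Bellman backup with a kernel-based value approximator. It assumes a Gaussian kernel and a simple epsilon-contamination uncertainty set (mixing the kernel-averaged backup with the worst sampled outcome); the function names, the `eps` parameter, and this particular uncertainty set are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def gaussian_kernel(x, centers, bandwidth=0.5):
    """Normalized Gaussian kernel weights of query state x w.r.t. sampled states."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return w / w.sum()

def robust_kernel_backup(x, action_data, V, gamma=0.95, eps=0.1):
    """One robust Bellman backup at state x.

    action_data: dict mapping action a -> (S_a, R_a, Sn_a), the sampled states,
                 rewards, and next states observed under a.
    V:           callable giving the current value estimate of a next state.
    eps:         size of the assumed contamination set; eps=0 recovers the
                 standard (non-robust) kernel-based backup.
    """
    q = {}
    for a, (S_a, R_a, Sn_a) in action_data.items():
        w = gaussian_kernel(x, S_a)                   # kernel weights over samples
        targets = R_a + gamma * np.array([V(sn) for sn in Sn_a])
        nominal = w @ targets                         # kernel-averaged backup
        worst = targets.min()                         # adversarial pick within the set
        q[a] = (1.0 - eps) * nominal + eps * worst    # robust (pessimistic) value
    greedy_action = max(q, key=q.get)
    return greedy_action, q
```

Iterating this backup over a set of representative states yields a robust value-iteration scheme; the robustness parameter `eps` controls how pessimistic the resulting policy is toward approximation and model errors.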