Monte Carlo Bayesian Reinforcement Learning
by Yi Wang, Kok Sung Won, David Hsu, Wee Sun Lee at ICML 2012
Bayesian reinforcement learning (BRL) encodes prior knowledge of the world in a model and represents uncertainty in model parameters by maintaining a posterior distribution over them. This paper proposes a simple and general approach to BRL. The idea is to sample a priori a finite set of hypotheses for the model parameter values and form a discrete partially observable Markov decision process (POMDP) whose state space is a cross product of the state space for the reinforcement learning task and the sampled model parameter space. The POMDP does not require conjugate distributions for belief representation, as earlier works do, and can be solved relatively easily with point-based approximation algorithms. Our approach naturally handles both fully and partially observable worlds. Theoretical and experimental results show that the discrete POMDP approximates the underlying BRL problem well with guaranteed performance.
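
The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-state toy world, the single unknown parameter theta, and every function name below are assumptions made for illustration. The sketch samples a finite set of hypotheses from a prior, pairs each task state with a hypothesis index to form the discrete POMDP state space, and updates the belief over hypotheses by Bayes' rule after an observed transition. No conjugate prior is needed, because the belief is just a finite weight vector.

```python
import random


def sample_hypotheses(prior_sampler, k, seed=0):
    """Draw k model-parameter hypotheses a priori (prior_sampler is hypothetical)."""
    rng = random.Random(seed)
    return [prior_sampler(rng) for _ in range(k)]


def build_hybrid_states(states, hypotheses):
    """POMDP state space: cross product of task states and hypothesis indices."""
    return [(s, i) for s in states for i in range(len(hypotheses))]


def transition_prob(theta, s, a, s_next):
    """Toy two-state model (an assumption, not from the paper).

    Action "go" switches state with probability theta; "stay" never switches.
    """
    if a == "stay":
        return 1.0 if s_next == s else 0.0
    return theta if s_next != s else 1.0 - theta


def update_belief(belief, hypotheses, s, a, s_next):
    """Bayes update of the weights over hypotheses after observing (s, a, s_next)."""
    weights = [b * transition_prob(th, s, a, s_next)
               for b, th in zip(belief, hypotheses)]
    total = sum(weights)
    if total == 0.0:
        # The transition is impossible under every sampled hypothesis;
        # keep the old belief rather than divide by zero.
        return list(belief)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Any sampler works here; a uniform prior on theta is used only as an example.
    hyps = sample_hypotheses(lambda rng: rng.uniform(0.0, 1.0), k=5)
    hybrid = build_hybrid_states([0, 1], hyps)
    belief = [1.0 / len(hyps)] * len(hyps)
    belief = update_belief(belief, hyps, s=0, a="go", s_next=1)
    print(len(hybrid), belief)
```

In this fully observable toy, the task state is observed directly and only the hypothesis index is hidden, so the POMDP belief reduces to a weight vector over the sampled hypotheses. A point-based POMDP solver would then plan over such beliefs; that step is omitted here.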

