Poster
Accountable Off-Policy Evaluation With Kernel Bellman Statistics
Yihao Feng · Tongzheng Ren · Ziyang Tang · Qiang Liu

Wed Jul 15 09:00 AM -- 09:45 AM & Wed Jul 15 08:00 PM -- 08:45 PM (PDT)

Off-policy evaluation plays an important role in modern reinforcement learning. However, most existing off-policy evaluation algorithms focus on point estimation, without providing an accountable confidence interval that reflects the uncertainty caused by limited observed data and algorithmic errors. In this work, we propose a new optimization-based framework that finds a feasible set containing the true value function with high probability, by leveraging the statistical properties of the recently proposed kernel Bellman loss (Feng et al., 2019). We further utilize the feasible set to construct accountable confidence intervals for off-policy evaluation, and propose a post-hoc diagnosis for existing estimators. Empirical results show that our methods yield tight yet accountable confidence intervals across different settings, demonstrating the effectiveness of our approach.
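The abstract builds on the kernel Bellman loss, a quadratic form of Bellman residuals under a kernel Gram matrix. Below is a minimal, hedged sketch of computing an empirical version of such a loss on toy transition data; the state dimensions, the candidate value function `q_fn`, the bandwidth, and the simplification of folding the target policy into `q_fn` are all illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy transition data (s, r, s'); shapes and dynamics are illustrative only.
n = 50
states = rng.normal(size=(n, 2))
next_states = states + rng.normal(scale=0.1, size=(n, 2))
rewards = rng.normal(size=n)
gamma = 0.99

def q_fn(s):
    # A hypothetical candidate value function to evaluate (linear in features).
    return s @ np.array([0.5, -0.3])

def rbf_kernel(x, y, bandwidth=1.0):
    # Gaussian (RBF) kernel matrix between two batches of feature vectors.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

# Bellman residuals: r + gamma * q(s') - q(s), with the target policy's
# action choice folded into q_fn for simplicity.
residuals = rewards + gamma * q_fn(next_states) - q_fn(states)

# Empirical kernel Bellman loss: average the kernel-weighted products of
# residuals over all pairs of transitions (a V-statistic-style estimate).
K = rbf_kernel(states, states)
kernel_bellman_loss = residuals @ K @ residuals / n ** 2
print(kernel_bellman_loss)
```

Because the RBF kernel is positive semidefinite, this quadratic form is non-negative and vanishes only when the residuals lie in the kernel's null space, which is what makes it usable as a loss whose concentration properties can bound the set of plausible value functions.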

Author Information

Yihao Feng (The University of Texas at Austin)
Tongzheng Ren (UT Austin)
Ziyang Tang (University of Texas at Austin)
Qiang Liu (UT Austin)