Poster
Adaptive Estimator Selection for Off-Policy Evaluation
Yi Su · Pavithra Srinath · Akshay Krishnamurthy
Wed Jul 15 05:00 AM -- 05:45 AM & Wed Jul 15 04:00 PM -- 04:45 PM (PDT)
We develop a generic data-driven method for estimator selection in off-policy policy evaluation settings. We establish a strong performance guarantee for the method, showing that it is competitive with the oracle estimator, up to a constant factor. Via in-depth case studies in contextual bandits and reinforcement learning, we demonstrate the generality and applicability of the method. We also perform comprehensive experiments, demonstrating the empirical efficacy of our approach and comparing with related approaches. In both case studies, our method compares favorably with existing methods.
Author Information
Yi Su (Cornell University)
Pavithra Srinath (Microsoft Research)
Akshay Krishnamurthy (Microsoft Research)
More from the Same Authors
- 2020 Poster: Doubly robust off-policy evaluation with shrinkage
  Yi Su · Maria Dimakopoulou · Akshay Krishnamurthy · Miroslav Dudik
- 2020 Poster: Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
  Dipendra Misra · Mikael Henaff · Akshay Krishnamurthy · John Langford
- 2020 Poster: Reward-Free Exploration for Reinforcement Learning
  Chi Jin · Akshay Krishnamurthy · Max Simchowitz · Tiancheng Yu
- 2020 Poster: Private Reinforcement Learning with PAC and Regret Guarantees
  Giuseppe Vietri · Borja de Balle Pigem · Akshay Krishnamurthy · Steven Wu
- 2019 Poster: CAB: Continuous Adaptive Blending for Policy Evaluation and Learning
  Yi Su · Lequn Wang · Michele Santacatterina · Thorsten Joachims
- 2019 Oral: CAB: Continuous Adaptive Blending for Policy Evaluation and Learning
  Yi Su · Lequn Wang · Michele Santacatterina · Thorsten Joachims
- 2019 Poster: Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments
  Kirthevasan Kandasamy · Willie Neiswanger · Reed Zhang · Akshay Krishnamurthy · Jeff Schneider · Barnabás Póczos
- 2019 Oral: Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments
  Kirthevasan Kandasamy · Willie Neiswanger · Reed Zhang · Akshay Krishnamurthy · Jeff Schneider · Barnabás Póczos
- 2019 Poster: Provably efficient RL with Rich Observations via Latent State Decoding
  Simon Du · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal · Miroslav Dudik · John Langford
- 2019 Oral: Provably efficient RL with Rich Observations via Latent State Decoding
  Simon Du · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal · Miroslav Dudik · John Langford
- 2018 Poster: Semiparametric Contextual Bandits
  Akshay Krishnamurthy · Steven Wu · Vasilis Syrgkanis
- 2018 Oral: Semiparametric Contextual Bandits
  Akshay Krishnamurthy · Steven Wu · Vasilis Syrgkanis
- 2017 Poster: Contextual Decision Processes with low Bellman rank are PAC-Learnable
  Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire
- 2017 Talk: Contextual Decision Processes with low Bellman rank are PAC-Learnable
  Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire
- 2017 Poster: Active Learning for Cost-Sensitive Classification
  Akshay Krishnamurthy · Alekh Agarwal · Tzu-Kuo Huang · Hal Daumé III · John Langford
- 2017 Talk: Active Learning for Cost-Sensitive Classification
  Akshay Krishnamurthy · Alekh Agarwal · Tzu-Kuo Huang · Hal Daumé III · John Langford