Timezone: »
Representation learning and exploration in reinforcement learning - Akshay Krishnamurthy
Akshay Krishnamurthy
I will discuss new provably efficient algorithms for reinforcement in rich observation environments with arbitrarly large state spaces. Both algorithms operate by learning succinct representations of the environment, which they use in an exploration module to acquire new information. The first algorithm, called Homer, operates in a block MDP model and uses a contrastive learning objective to learn the representation. On the other hand, the second algorithm, called FLAMBE, operates in a much richer class of low rank MDPs and is model based. Both algorithms accommodate nonlinear function approximation and enjoy provable sample and computational efficiency guarantees.
Author Information
Akshay Krishnamurthy (Microsoft Research)
More from the Same Authors
-
2021 : Provable RL with Exogenous Distractors via Multistep Inverse Dynamics »
Yonathan Efroni · Dipendra Misra · Akshay Krishnamurthy · Alekh Agarwal · John Langford -
2021 : Sparsity in the Partially Controllable LQR »
Yonathan Efroni · Sham Kakade · Akshay Krishnamurthy · Cyril Zhang -
2023 : Exposing Attention Glitches with Flip-Flop Language Modeling »
Bingbin Liu · Jordan Ash · Surbhi Goel · Akshay Krishnamurthy · Cyril Zhang -
2023 : Exposing Attention Glitches with Flip-Flop Language Modeling »
Bingbin Liu · Jordan Ash · Surbhi Goel · Akshay Krishnamurthy · Cyril Zhang -
2023 Poster: Streaming Active Learning with Deep Neural Networks »
Akanksha Saran · Safoora Yousefi · Akshay Krishnamurthy · John Langford · Jordan Ash -
2023 Poster: Statistical Learning under Heterogenous Distribution Shift »
Max Simchowitz · Anurag Ajay · Pulkit Agrawal · Akshay Krishnamurthy -
2022 Poster: Universal and data-adaptive algorithms for model selection in linear contextual bandits »
Vidya Muthukumar · Akshay Krishnamurthy -
2022 Spotlight: Universal and data-adaptive algorithms for model selection in linear contextual bandits »
Vidya Muthukumar · Akshay Krishnamurthy -
2022 Poster: Sparsity in Partially Controllable Linear Systems »
Yonathan Efroni · Sham Kakade · Akshay Krishnamurthy · Cyril Zhang -
2022 Poster: Understanding Contrastive Learning Requires Incorporating Inductive Biases »
Nikunj Umesh Saunshi · Jordan Ash · Surbhi Goel · Dipendra Kumar Misra · Cyril Zhang · Sanjeev Arora · Sham Kakade · Akshay Krishnamurthy -
2022 Spotlight: Sparsity in Partially Controllable Linear Systems »
Yonathan Efroni · Sham Kakade · Akshay Krishnamurthy · Cyril Zhang -
2022 Spotlight: Understanding Contrastive Learning Requires Incorporating Inductive Biases »
Nikunj Umesh Saunshi · Jordan Ash · Surbhi Goel · Dipendra Kumar Misra · Cyril Zhang · Sanjeev Arora · Sham Kakade · Akshay Krishnamurthy -
2022 Poster: Provable Reinforcement Learning with a Short-Term Memory »
Yonathan Efroni · Chi Jin · Akshay Krishnamurthy · Sobhan Miryoosefi -
2022 Spotlight: Provable Reinforcement Learning with a Short-Term Memory »
Yonathan Efroni · Chi Jin · Akshay Krishnamurthy · Sobhan Miryoosefi -
2021 : Sparsity in the Partially Controllable LQR »
Yonathan Efroni · Sham Kakade · Akshay Krishnamurthy · Cyril Zhang -
2020 : Speaker Panel »
Csaba Szepesvari · Martha White · Sham Kakade · Gergely Neu · Shipra Agrawal · Akshay Krishnamurthy -
2020 Poster: Doubly robust off-policy evaluation with shrinkage »
Yi Su · Maria Dimakopoulou · Akshay Krishnamurthy · Miroslav Dudik -
2020 Poster: Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning »
Dipendra Kumar Misra · Mikael Henaff · Akshay Krishnamurthy · John Langford -
2020 Poster: Reward-Free Exploration for Reinforcement Learning »
Chi Jin · Akshay Krishnamurthy · Max Simchowitz · Tiancheng Yu -
2020 Poster: Adaptive Estimator Selection for Off-Policy Evaluation »
Yi Su · Pavithra Srinath · Akshay Krishnamurthy -
2020 Poster: Private Reinforcement Learning with PAC and Regret Guarantees »
Giuseppe Vietri · Borja de Balle Pigem · Akshay Krishnamurthy · Steven Wu -
2019 Poster: Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments »
Kirthevasan Kandasamy · Willie Neiswanger · Reed Zhang · Akshay Krishnamurthy · Jeff Schneider · Barnabás Póczos -
2019 Oral: Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments »
Kirthevasan Kandasamy · Willie Neiswanger · Reed Zhang · Akshay Krishnamurthy · Jeff Schneider · Barnabás Póczos -
2019 Poster: Provably efficient RL with Rich Observations via Latent State Decoding »
Simon Du · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal · Miroslav Dudik · John Langford -
2019 Oral: Provably efficient RL with Rich Observations via Latent State Decoding »
Simon Du · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal · Miroslav Dudik · John Langford -
2018 Poster: Semiparametric Contextual Bandits »
Akshay Krishnamurthy · Steven Wu · Vasilis Syrgkanis -
2018 Oral: Semiparametric Contextual Bandits »
Akshay Krishnamurthy · Steven Wu · Vasilis Syrgkanis -
2017 Poster: Contextual Decision Processes with low Bellman rank are PAC-Learnable »
Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2017 Talk: Contextual Decision Processes with low Bellman rank are PAC-Learnable »
Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2017 Poster: Active Learning for Cost-Sensitive Classification »
Akshay Krishnamurthy · Alekh Agarwal · Tzu-Kuo Huang · Hal Daumé III · John Langford -
2017 Talk: Active Learning for Cost-Sensitive Classification »
Akshay Krishnamurthy · Alekh Agarwal · Tzu-Kuo Huang · Hal Daumé III · John Langford