24 Results

Poster
Tue 7:00 From Importance Sampling to Doubly Robust Policy Gradient
Jiawei Huang, Nan Jiang
Poster
Tue 8:00 GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
Shangtong Zhang, Bo Liu, Shimon Whiteson
Poster
Tue 8:00 Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition
Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu
Poster
Tue 10:00 Logarithmic Regret for Adversarial Online Control
Dylan Foster, Max Simchowitz
Poster
Tue 12:00 Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Yao Liu, Pierre-Luc Bacon, Emma Brunskill
Poster
Tue 13:00 Learning with Good Feature Representations in Bandits and in RL with a Generative Model
Tor Lattimore, Csaba Szepesvari, Gellért Weisz
Poster
Tue 14:00 Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
Asaf Cassel, Alon Cohen, Tomer Koren
Poster
Tue 15:00 Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation
Nathan Kallus, Masatoshi Uehara
Poster
Wed 5:00 Adaptive Estimator Selection for Off-Policy Evaluation
Yi Su, Pavithra Srinath, Akshay Krishnamurthy
Poster
Wed 9:00 Provable Self-Play Algorithms for Competitive Reinforcement Learning
Yu Bai, Chi Jin
Poster
Wed 10:00 Provable Representation Learning for Imitation Learning via Bi-level Optimization
Sanjeev Arora, Simon Du, Sham Kakade, Yuping Luo, Nikunj Saunshi
Poster
Wed 11:00 Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette, Alessandro Lazaric, Mykel Kochenderfer, Emma Brunskill
Poster
Wed 14:00 Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei, Mehdi Jafarnia, Haipeng Luo, Hiteshi Sharma, Rahul Jain
Poster
Wed 15:00 Statistically Efficient Off-Policy Policy Gradients
Nathan Kallus, Masatoshi Uehara
Poster
Thu 6:00 Reward-Free Exploration for Reinforcement Learning
Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu
Poster
Thu 6:00 On the Expressivity of Neural Networks for Deep Reinforcement Learning
Kefan Dong, Yuping Luo, Tianhe (Kevin) Yu, Chelsea Finn, Tengyu Ma
Poster
Thu 7:00 Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Lin Yang, Mengdi Wang
Poster
Thu 7:00 Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Dipendra Kumar Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford
Poster
Thu 7:00 Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub, Zeyu Jia, Csaba Szepesvari, Mengdi Wang, Lin Yang
Poster
Thu 8:00 Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson
Poster
Thu 9:00 On the Global Convergence Rates of Softmax Policy Gradient Methods
Jincheng Mei, Chenjun Xiao, Csaba Szepesvari, Dale Schuurmans
Poster
Thu 12:00 Naive Exploration is Optimal for Online LQR
Max Simchowitz, Dylan Foster
Poster
Thu 14:00 Optimistic Policy Optimization with Bandit Feedback
Lior Shani, Yonathan Efroni, Aviv Rosenberg, Shie Mannor
Poster
Thu 17:00 Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara, Jiawei Huang, Nan Jiang