Sat Jul 24 09:00 AM -- 09:25 AM (PDT)
Invited Speaker: Emilie Kaufmann: On pure-exploration in Markov Decision Processes
Sat Jul 24 09:30 AM -- 09:55 AM (PDT)
Invited Speaker: Christian Kroer: Recent Advances in Iterative Methods for Large-Scale Game Solving
Sat Jul 24 10:00 AM -- 10:12 AM (PDT)
Sparsity in the Partially Controllable LQR
Sat Jul 24 10:15 AM -- 10:27 AM (PDT)
On the Theory of Reinforcement Learning with Once-per-Episode Feedback
Sat Jul 24 10:30 AM -- 10:42 AM (PDT)
Implicit Finite-Horizon Approximation for Stochastic Shortest Path
Sat Jul 24 10:45 AM -- 10:57 AM (PDT)
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Sat Jul 24 11:00 AM -- 11:25 AM (PDT)
Invited Speaker: Animashree Anandkumar: Stability-aware reinforcement learning in dynamical systems
Sat Jul 24 11:30 AM -- 11:55 AM (PDT)
Invited Speaker: Shie Mannor: Lenient Regret
Sat Jul 24 12:30 PM -- 02:00 PM (PDT)
Poster Session - I
Sat Jul 24 02:00 PM -- 02:25 PM (PDT)
Invited Speaker: Bo Dai: Leveraging Non-uniformity in Policy Gradient
Sat Jul 24 02:30 PM -- 02:55 PM (PDT)
Invited Speaker: Qiaomin Xie: Reinforcement Learning for Zero-Sum Markov Games Using Function Approximation and Correlated Equilibrium
Sat Jul 24 03:00 PM -- 03:12 PM (PDT)
Bad-Policy Density: A Measure of Reinforcement-Learning Hardness
Sat Jul 24 03:15 PM -- 03:27 PM (PDT)
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
Sat Jul 24 03:30 PM -- 03:42 PM (PDT)
Solving Multi-Arm Bandit Using a Few Bits of Communication
Sat Jul 24 03:45 PM -- 03:57 PM (PDT)
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Sat Jul 24 04:00 PM -- 04:25 PM (PDT)
Invited Speaker: Art Owen: Empirical likelihood for reinforcement learning
Sat Jul 24 04:30 PM -- 05:00 PM (PDT)
Panel Session: Animashree Anandkumar, Christian Kroer, Art Owen, Qiaomin Xie
Sat Jul 24 05:30 PM -- 09:00 PM (PDT)
Poster Session - II
Multi-Task Offline Reinforcement Learning with Conservative Data Sharing
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Bridging The Gap between Local and Joint Differential Privacy in RL
Learning Pareto-Optimal Policies in Low-Rank Cooperative Markov Games
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
Model-based Offline Reinforcement Learning with Local Misspecification
Reward-Weighted Regression Converges to a Global Optimum
Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning
Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection
Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games
Marginalized Operators for Off-Policy Reinforcement Learning
Online Sub-Sampling for Reinforcement Learning with General Function Approximation
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Provably Efficient Multi-Task Reinforcement Learning with Model Transfer
Estimating Optimal Policy Value in Linear Contextual Bandits beyond Gaussianity
Mind the Gap: Safely Bridging Offline and Online Reinforcement Learning
Finite time analysis of temporal difference learning with linear function approximation: the tail averaged case
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Non-Stationary Representation Learning in Sequential Multi-Armed Bandits
Identification and Adaptive Control of Markov Jump Systems: Sample Complexity and Regret Bounds
Provable RL with Exogenous Distractors via Multistep Inverse Dynamics
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
A Short Note on the Relationship of Information Gain and Eluder Dimension
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
A functional mirror ascent view of policy gradient methods with function approximation
Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation
A Spectral Approach to Off-Policy Evaluation for POMDPs
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Provably efficient exploration-free transfer RL for near-deterministic latent dynamics
Nearly Optimal Regret for Learning Adversarial MDPs with Linear Function Approximation
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Model-Free Approach to Evaluate Reinforcement Learning Algorithms
Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation
Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
Online Learning for Stochastic Shortest Path Model via Posterior Sampling
Statistical Inference with M-Estimators on Adaptively Collected Data
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure
Optimal and instance-dependent oracle inequalities for policy evaluation
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity
The Importance of Non-Markovianity in Maximum State Entropy Exploration
Finite-Sample Analysis of Off-Policy Natural Actor-Critic With Linear Function Approximation
When Is Generalizable Reinforcement Learning Tractable?
Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
Nonstationary Reinforcement Learning with Linear Function Approximation
Collision Resolution in Multi-player Bandits Without Observing Collision Information
Subgaussian Importance Sampling for Off-Policy Evaluation and Learning
Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks
Triple-Q: A Model-Free Algorithm for Constrained Reinforcement Learning with Sublinear Regret and Zero Constraint Violation
Finding the Near Optimal Policy via Reductive Regularization in MDPs
Finite Sample Analysis of Average-Reward TD Learning and $Q$-Learning
A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs
Efficient Inverse Reinforcement Learning of Transferable Rewards
Learning Stackelberg Equilibria in Sequential Price Mechanisms
Comparison and Unification of Three Regularization Methods in Batch Reinforcement Learning
A general sample complexity analysis of vanilla policy gradient
Almost Optimal Algorithms for Two-player Markov Games with Linear Function Approximation
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
Learning Adversarial Markov Decision Processes with Delayed Feedback
On the Theory of Reinforcement Learning with Once-per-Episode Feedback
Implicit Finite-Horizon Approximation for Stochastic Shortest Path
Bad-Policy Density: A Measure of Reinforcement-Learning Hardness
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
Solving Multi-Arm Bandit Using a Few Bits of Communication
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning