Sun Jul 25 01:00 AM -- 01:25 AM (KST)
Invited Speaker: Emilie Kaufmann: On pure-exploration in Markov Decision Processes
Sun Jul 25 01:30 AM -- 01:55 AM (KST)
Invited Speaker: Christian Kroer: Recent Advances in Iterative Methods for Large-Scale Game Solving
Sun Jul 25 02:00 AM -- 02:12 AM (KST)
Sparsity in the Partially Controllable LQR
Sun Jul 25 02:15 AM -- 02:27 AM (KST)
On the Theory of Reinforcement Learning with Once-per-Episode Feedback
Sun Jul 25 02:30 AM -- 02:42 AM (KST)
Implicit Finite-Horizon Approximation for Stochastic Shortest Path
Sun Jul 25 02:45 AM -- 02:57 AM (KST)
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Sun Jul 25 03:00 AM -- 03:25 AM (KST)
Invited Speaker: Animashree Anandkumar: Stability-aware reinforcement learning in dynamical systems
Sun Jul 25 03:30 AM -- 03:55 AM (KST)
Invited Speaker: Shie Mannor: Lenient Regret
Sun Jul 25 04:30 AM -- 06:00 AM (KST)
Poster Session - I
Sun Jul 25 06:00 AM -- 06:25 AM (KST)
Invited Speaker: Bo Dai: Leveraging Non-uniformity in Policy Gradient
Sun Jul 25 06:30 AM -- 06:55 AM (KST)
Invited Speaker: Qiaomin Xie: Reinforcement Learning for Zero-Sum Markov Games Using Function Approximation and Correlated Equilibrium
Sun Jul 25 07:00 AM -- 07:12 AM (KST)
Bad-Policy Density: A Measure of Reinforcement-Learning Hardness
Sun Jul 25 07:15 AM -- 07:27 AM (KST)
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
Sun Jul 25 07:30 AM -- 07:42 AM (KST)
Solving Multi-Arm Bandit Using a Few Bits of Communication
Sun Jul 25 07:45 AM -- 07:57 AM (KST)
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Sun Jul 25 08:00 AM -- 08:25 AM (KST)
Invited Speaker: Art Owen: Empirical likelihood for reinforcement learning
Sun Jul 25 08:30 AM -- 09:00 AM (KST)
Panel Session: Animashree Anandkumar, Christian Kroer, Art Owen, Qiaomin Xie
Sun Jul 25 09:30 AM -- 01:00 PM (KST)
Poster Session - II
Multi-Task Offline Reinforcement Learning with Conservative Data Sharing
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Bridging The Gap between Local and Joint Differential Privacy in RL
Learning Pareto-Optimal Policies in Low-Rank Cooperative Markov Games
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
Model-based Offline Reinforcement Learning with Local Misspecification
Reward-Weighted Regression Converges to a Global Optimum
Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning
Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection
Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games
Marginalized Operators for Off-Policy Reinforcement Learning
Online Sub-Sampling for Reinforcement Learning with General Function Approximation
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Provably Efficient Multi-Task Reinforcement Learning with Model Transfer
Estimating Optimal Policy Value in Linear Contextual Bandits beyond Gaussianity
Mind the Gap: Safely Bridging Offline and Online Reinforcement Learning
Finite time analysis of temporal difference learning with linear function approximation: the tail averaged case
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Non-Stationary Representation Learning in Sequential Multi-Armed Bandits
Identification and Adaptive Control of Markov Jump Systems: Sample Complexity and Regret Bounds
Provable RL with Exogenous Distractors via Multistep Inverse Dynamics
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
A Short Note on the Relationship of Information Gain and Eluder Dimension
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
A functional mirror ascent view of policy gradient methods with function approximation
Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation
A Spectral Approach to Off-Policy Evaluation for POMDPs
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Provably efficient exploration-free transfer RL for near-deterministic latent dynamics
Nearly Optimal Regret for Learning Adversarial MDPs with Linear Function Approximation
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Model-Free Approach to Evaluate Reinforcement Learning Algorithms
Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation
Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
Online Learning for Stochastic Shortest Path Model via Posterior Sampling
Statistical Inference with M-Estimators on Adaptively Collected Data
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure
Optimal and instance-dependent oracle inequalities for policy evaluation
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity
The Importance of Non-Markovianity in Maximum State Entropy Exploration
Finite-Sample Analysis of Off-Policy Natural Actor-Critic With Linear Function Approximation
When Is Generalizable Reinforcement Learning Tractable?
Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
Nonstationary Reinforcement Learning with Linear Function Approximation
Collision Resolution in Multi-player Bandits Without Observing Collision Information
Subgaussian Importance Sampling for Off-Policy Evaluation and Learning
Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks
Triple-Q: A Model-Free Algorithm for Constrained Reinforcement Learning with Sublinear Regret and Zero Constraint Violation
Finding the Near Optimal Policy via Reductive Regularization in MDPs
Finite Sample Analysis of Average-Reward TD Learning and $Q$-Learning
A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs
Efficient Inverse Reinforcement Learning of Transferable Rewards
Learning Stackelberg Equilibria in Sequential Price Mechanisms
Comparison and Unification of Three Regularization Methods in Batch Reinforcement Learning
A general sample complexity analysis of vanilla policy gradient
Almost Optimal Algorithms for Two-player Markov Games with Linear Function Approximation
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
Learning Adversarial Markov Decision Processes with Delayed Feedback
On the Theory of Reinforcement Learning with Once-per-Episode Feedback
Implicit Finite-Horizon Approximation for Stochastic Shortest Path
Bad-Policy Density: A Measure of Reinforcement-Learning Hardness
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
Solving Multi-Arm Bandit Using a Few Bits of Communication
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning