397 Results

Oral
Tue 5:00 Phasic Policy Gradient
Karl Cobbe, Jacob Hilton, Oleg Klimov, John Schulman
Oral
Tue 5:00 Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot
Joel Z Leibo, Edgar Duenez-Guzman, Sasha Vezhnevets, John Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charlie Beattie, Igor Mordatch, Thore Graepel
Oral
Tue 5:00 Deeply-Debiased Off-Policy Interval Estimation
Chengchun Shi, Runzhe Wan, Victor Chernozhukov, Rui Song
Spotlight
Tue 5:20 Reinforcement Learning with Prototypical Representations
Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto
Spotlight
Tue 5:20 UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning
Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Boehmer, Shimon Whiteson
Spotlight
Tue 5:20 Offline Contextual Bandits with Overparameterized Models
David Brandfonbrener, Will Whitney, Rajesh Ranganath, Joan Bruna
Spotlight
Tue 5:25 Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation
Christopher Dance, Julien Perez, Théo Cachet
Spotlight
Tue 5:25 A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
Dong Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan How
Spotlight
Tue 5:25 Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration
Seungyul Han, Youngchul Sung
Spotlight
Tue 5:30 Bias-Robust Bayesian Optimization via Dueling Bandits
Johannes Kirschner, Andreas Krause
Spotlight
Tue 5:30 Muesli: Combining Improvements in Policy Optimization
Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theo Weber, David Silver, Hado van Hasselt
Spotlight
Tue 5:30 A New Representation of Successor Features for Transfer across Dissimilar Environments
Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta, Santu Rana, Svetha Venkatesh
Spotlight
Tue 5:30 Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel
Spotlight
Tue 5:35 Preferential Temporal Difference Learning
Nishanth Anand, Doina Precup
Spotlight
Tue 5:35 PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
Yuda Song, Wen Sun
Spotlight
Tue 5:35 Unsupervised Learning of Visual 3D Keypoints for Control
Boyuan Chen, Pieter Abbeel, Deepak Pathak
Spotlight
Tue 5:40 On the Optimality of Batch Policy Optimization Algorithms
Chenjun Xiao, Yifan Wu, Jincheng Mei, Bo Dai, Tor Lattimore, Lihong Li, Csaba Szepesvari, Dale Schuurmans
Spotlight
Tue 5:40 Learning Task Informed Abstractions
Xiang Fu, Ge Yang, Pulkit Agrawal, Tommi Jaakkola
Spotlight
Tue 5:40 Imitation by Predicting Observations
Andrew Jaegle, Yury Sulsky, Arun Ahuja, Jake Bruce, Rob Fergus, Greg Wayne
Spotlight
Tue 5:45 State Entropy Maximization with Random Encoders for Efficient Exploration
Younggyo Seo, Lili Chen, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
Oral
Tue 6:00 Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
Tom Fei, Zhuoran Yang, Zhaoran Wang
Spotlight
Tue 6:20 Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity
Zihan Zhang, Yuan Zhou, Xiangyang Ji
Spotlight
Tue 6:25 Neuro-algorithmic Policies Enable Fast Combinatorial Generalization
Marin Vlastelica, Michal Rolinek, Georg Martius
Spotlight
Tue 6:30 PID Accelerated Value Iteration Algorithm
Amir-massoud Farahmand, Mohammad Ghavamzadeh
Spotlight
Tue 6:35 Provably Efficient Learning of Transferable Rewards
Alberto Maria Metelli, Giorgia Ramponi, Alessandro Concetti, Marcello Restelli
Spotlight
Tue 6:40 Bayesian Deep Learning via Subnetwork Inference
Erik Daxberger, Eric Nalisnick, James Allingham, Javier Antorán, Jose Miguel Hernandez-Lobato
Spotlight
Tue 6:40 Reinforcement Learning for Cost-Aware Markov Decision Processes
Wesley A Suttle, Kaiqing Zhang, Zhuoran Yang, Ji Liu, David N Kraemer
Oral Session
Tue 7:00 Reinforcement Learning and Planning 1
Oral Session
Tue 7:00 Reinforcement Learning and Planning 2
Oral
Tue 7:00 World Model as a Graph: Learning Latent Landmarks for Planning
Lunjun Zhang, Ge Yang, Bradly Stadie
Oral
Tue 7:00 Skill Discovery for Exploration and Planning using Deep Skill Graphs
Akhil Bagaria, Jason Senthil, George Konidaris
Oral
Tue 7:00 Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu, Anima Anandkumar
Spotlight
Tue 7:20 Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research
Johan Obando Ceron, Pablo Samuel Castro
Spotlight
Tue 7:20 Learning Routines for Effective Off-Policy Reinforcement Learning
Edoardo Cetin, Oya Celiktutan
Spotlight
Tue 7:20 Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar
Spotlight
Tue 7:25 Deep Reinforcement Learning amidst Continual Structured Non-Stationarity
Annie Xie, James Harrison, Chelsea Finn
Spotlight
Tue 7:25 PODS: Policy Optimization via Differentiable Simulation
Miguel Angel Zamora Mora, Momchil Peychev, Sehoon Ha, Martin Vechev, Stelian Coros
Spotlight
Tue 7:25 A New Formalism, Method and Open Issues for Zero-Shot Coordination
Johannes Treutlein, Michael Dennis, Caspar Oesterheld, Jakob Foerster
Spotlight
Tue 7:30 Learning and Planning in Complex Action Spaces
Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Amin Barekatain, Simon Schmitt, David Silver
Spotlight
Tue 7:30 Offline Reinforcement Learning with Pseudometric Learning
Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist
Spotlight
Tue 7:30 Targeted Data Acquisition for Evolving Negotiation Agents
Minae Kwon, Sidd Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh
Spotlight
Tue 7:35 Model-Based Reinforcement Learning via Latent-Space Collocation
Oleg Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine
Spotlight
Tue 7:35 Inverse Constrained Reinforcement Learning
Shehryar Malik, Usman Anwar, Alireza Aghasi, Ali Ahmed
Spotlight
Tue 7:35 EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Gu
Spotlight
Tue 7:40 Vector Quantized Models for Planning
Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals
Spotlight
Tue 7:40 Counterfactual Credit Assignment in Model-Free Reinforcement Learning
Thomas Mesnard, Theo Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Thomas Stepleton, Nicolas Heess, Arthur Guez, Eric Moulines, Marcus Hutter, Lars Buesing, Remi Munos
Spotlight
Tue 7:45 Improved Denoising Diffusion Probabilistic Models
Alexander Nichol, Prafulla Dhariwal
Spotlight
Tue 7:45 Interactive Learning from Activity Description
Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudik, Patrick Shafto
Spotlight
Tue 7:45 LTL2Action: Generalizing LTL Instructions for Multi-Task RL
Pashootan Vaezipoor, Andrew C Li, Rodrigo A Toro Icarte, Sheila McIlraith
Poster
Tue 9:00 Deep Reinforcement Learning amidst Continual Structured Non-Stationarity
Annie Xie, James Harrison, Chelsea Finn
Poster
Tue 9:00 World Model as a Graph: Learning Latent Landmarks for Planning
Lunjun Zhang, Ge Yang, Bradly Stadie
Poster
Tue 9:00 LTL2Action: Generalizing LTL Instructions for Multi-Task RL
Pashootan Vaezipoor, Andrew C Li, Rodrigo A Toro Icarte, Sheila McIlraith
Poster
Tue 9:00 Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation
Christopher Dance, Julien Perez, Théo Cachet
Poster
Tue 9:00 Offline Contextual Bandits with Overparameterized Models
David Brandfonbrener, Will Whitney, Rajesh Ranganath, Joan Bruna
Poster
Tue 9:00 Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel
Poster
Tue 9:00 A New Formalism, Method and Open Issues for Zero-Shot Coordination
Johannes Treutlein, Michael Dennis, Caspar Oesterheld, Jakob Foerster
Poster
Tue 9:00 PODS: Policy Optimization via Differentiable Simulation
Miguel Angel Zamora Mora, Momchil Peychev, Sehoon Ha, Martin Vechev, Stelian Coros
Poster
Tue 9:00 Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot
Joel Z Leibo, Edgar Duenez-Guzman, Sasha Vezhnevets, John Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charlie Beattie, Igor Mordatch, Thore Graepel
Poster
Tue 9:00 Bias-Robust Bayesian Optimization via Dueling Bandits
Johannes Kirschner, Andreas Krause
Poster
Tue 9:00 Bayesian Deep Learning via Subnetwork Inference
Erik Daxberger, Eric Nalisnick, James Allingham, Javier Antorán, Jose Miguel Hernandez-Lobato
Poster
Tue 9:00 Deeply-Debiased Off-Policy Interval Estimation
Chengchun Shi, Runzhe Wan, Victor Chernozhukov, Rui Song
Poster
Tue 9:00 State Entropy Maximization with Random Encoders for Efficient Exploration
Younggyo Seo, Lili Chen, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
Poster
Tue 9:00 Counterfactual Credit Assignment in Model-Free Reinforcement Learning
Thomas Mesnard, Theo Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Thomas Stepleton, Nicolas Heess, Arthur Guez, Eric Moulines, Marcus Hutter, Lars Buesing, Remi Munos
Poster
Tue 9:00 Interactive Learning from Activity Description
Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudik, Patrick Shafto
Poster
Tue 9:00 On the Optimality of Batch Policy Optimization Algorithms
Chenjun Xiao, Yifan Wu, Jincheng Mei, Bo Dai, Tor Lattimore, Lihong Li, Csaba Szepesvari, Dale Schuurmans
Poster
Tue 9:00 Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu, Anima Anandkumar
Poster
Tue 9:00 Unsupervised Learning of Visual 3D Keypoints for Control
Boyuan Chen, Pieter Abbeel, Deepak Pathak
Poster
Tue 9:00 Provably Efficient Learning of Transferable Rewards
Alberto Maria Metelli, Giorgia Ramponi, Alessandro Concetti, Marcello Restelli
Poster
Tue 9:00 PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
Yuda Song, Wen Sun
Poster
Tue 9:00 Inverse Constrained Reinforcement Learning
Shehryar Malik, Usman Anwar, Alireza Aghasi, Ali Ahmed
Poster
Tue 9:00 EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Gu
Poster
Tue 9:00 Phasic Policy Gradient
Karl Cobbe, Jacob Hilton, Oleg Klimov, John Schulman
Poster
Tue 9:00 Learning Task Informed Abstractions
Xiang Fu, Ge Yang, Pulkit Agrawal, Tommi Jaakkola
Poster
Tue 9:00 Skill Discovery for Exploration and Planning using Deep Skill Graphs
Akhil Bagaria, Jason Senthil, George Konidaris
Poster
Tue 9:00 Preferential Temporal Difference Learning
Nishanth Anand, Doina Precup
Poster
Tue 9:00 Offline Reinforcement Learning with Pseudometric Learning
Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist
Poster
Tue 9:00 Imitation by Predicting Observations
Andrew Jaegle, Yury Sulsky, Arun Ahuja, Jake Bruce, Rob Fergus, Greg Wayne
Poster
Tue 9:00 Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration
Seungyul Han, Youngchul Sung
Poster
Tue 9:00 Model-Based Reinforcement Learning via Latent-Space Collocation
Oleg Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine
Poster
Tue 9:00 Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar
Poster
Tue 9:00 Learning and Planning in Complex Action Spaces
Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Amin Barekatain, Simon Schmitt, David Silver
Poster
Tue 9:00 PID Accelerated Value Iteration Algorithm
Amir-massoud Farahmand, Mohammad Ghavamzadeh
Poster
Tue 9:00 Reinforcement Learning with Prototypical Representations
Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto
Poster
Tue 9:00 A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
Dong Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan How
Poster
Tue 9:00 Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity
Zihan Zhang, Yuan Zhou, Xiangyang Ji
Poster
Tue 9:00 Neuro-algorithmic Policies Enable Fast Combinatorial Generalization
Marin Vlastelica, Michal Rolinek, Georg Martius
Poster
Tue 9:00 A New Representation of Successor Features for Transfer across Dissimilar Environments
Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta, Santu Rana, Svetha Venkatesh
Poster
Tue 9:00 Learning Routines for Effective Off-Policy Reinforcement Learning
Edoardo Cetin, Oya Celiktutan
Poster
Tue 9:00 Decoupling Value and Policy for Generalization in Reinforcement Learning
Roberta Raileanu, Rob Fergus
Poster
Tue 9:00 Muesli: Combining Improvements in Policy Optimization
Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theo Weber, David Silver, Hado van Hasselt
Poster
Tue 9:00 Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
Tom Fei, Zhuoran Yang, Zhaoran Wang
Poster
Tue 9:00 Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research
Johan Obando Ceron, Pablo Samuel Castro
Poster
Tue 9:00 Targeted Data Acquisition for Evolving Negotiation Agents
Minae Kwon, Sidd Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh
Poster
Tue 9:00 UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning
Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Boehmer, Shimon Whiteson
Poster
Tue 9:00 Vector Quantized Models for Planning
Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals
Poster
Tue 9:00 Improved Denoising Diffusion Probabilistic Models
Alexander Nichol, Prafulla Dhariwal
Poster
Tue 9:00 Reinforcement Learning for Cost-Aware Markov Decision Processes
Wesley A Suttle, Kaiqing Zhang, Zhuoran Yang, Ji Liu, David N Kraemer
Oral
Tue 17:00 Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
Shariq Iqbal, Christian Schroeder, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha
Oral
Tue 17:00 PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar
Oral Session
Tue 17:00 Reinforcement Learning and Planning 3
Oral
Tue 17:00 Robust Asymmetric Learning in POMDPs
Andrew Warrington, Jonathan Lavington, Adam Scibior, Mark Schmidt, Frank Wood
Spotlight
Tue 17:20 Differentiable Spatial Planning using Transformers
Devendra Singh Chaplot, Deepak Pathak, Jitendra Malik
Spotlight
Tue 17:20 Safe Reinforcement Learning with Linear Function Approximation
Sanae Amani, Christos Thrampoulidis, Lin Yang
Spotlight
Tue 17:25 Convex Regularization in Monte-Carlo Tree Search
Tuan Q Dam, Carlo D'Eramo, Jan Peters, Joni Pajarinen
Spotlight
Tue 17:25 Adapting to Delays and Data in Adversarial Multi-Armed Bandits
András György, Pooria Joulani
Spotlight
Tue 17:25 Emergent Social Learning via Multi-agent Reinforcement Learning
Kamal Ndousse, Douglas Eck, Sergey Levine, Natasha Jaques
Spotlight
Tue 17:25 Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks
Sungryull Sohn, Sungtae Lee, Jongwook Choi, Harm van Seijen, Mehdi Fatemi, Honglak Lee
Spotlight
Tue 17:30 From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls
Spotlight
Tue 17:30 Offline Reinforcement Learning with Fisher Divergence Critic Regularization
Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum
Spotlight
Tue 17:30 On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Yiming Zhang, Keith Ross
Spotlight
Tue 17:35 Recomposing the Reinforcement Learning Building Blocks with Hypernetworks
Elad Sarafian, Shai Keynan, Sarit Kraus
Spotlight
Tue 17:35 Multi-Task Reinforcement Learning with Context-based Representations
Shagun Sodhani, Amy Zhang, Joelle Pineau
Spotlight
Tue 17:40 OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee, Wonseok Jeon, Byung-Jun Lee, Joelle Pineau, Kee-Eung Kim
Spotlight
Tue 17:40 High Confidence Generalization for Reinforcement Learning
James Kostas, Yash Chandak, Scott Jordan, Georgios Theocharous, Philip Thomas
Spotlight
Tue 17:40 Trajectory Diversity for Zero-Shot Coordination
Andrei Lupu, Brandon Cui, Hengyuan Hu, Jakob Foerster
Spotlight
Tue 17:45 Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
Susan Amin, Maziar Gomrokchi, Hossein Aboutalebi, Harsh Satija, Doina Precup
Spotlight
Tue 17:45 FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning
Tianhao Zhang, Yueheng Li, Chen Wang, Guangming Xie, Zongqing Lu
Spotlight
Tue 17:45 Discovering symbolic policies with deep reinforcement learning
Mikel Landajuela Larma, Brenden Petersen, Sookyung Kim, Claudio Santiago, Ruben Glatt, Nathan Mundhenk, Jacob Pettit, Daniel Faissol
Oral
Tue 18:00 The Emergence of Individuality
Jiechuan Jiang, Zongqing Lu
Oral
Tue 18:00 Decoupling Value and Policy for Generalization in Reinforcement Learning
Roberta Raileanu, Rob Fergus
Oral
Tue 18:00 PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
Kimin Lee, Laura Smith, Pieter Abbeel
Spotlight
Tue 18:20 DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning
Wei-Fang Sun, Cheng-Kuang Lee, Chun-Yi Lee
Spotlight
Tue 18:20 Prioritized Level Replay
Minqi Jiang, Edward Grefenstette, Tim Rocktäschel
Spotlight
Tue 18:20 Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
Yue Wu, Shuangfei Zhai, Nitish Srivastava, Josh M Susskind, Jian Zhang, Russ Salakhutdinov, Hanlin Goh
Spotlight
Tue 18:25 Keyframe-Focused Visual Imitation Learning
Chuan Wen, Jierui Lin, Jianing Qian, Yang Gao, Dinesh Jayaraman
Spotlight
Tue 18:25 SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
Jim Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar
Spotlight
Tue 18:25 From Local to Global Norm Emergence: Dissolving Self-reinforcing Substructures with Incremental Social Instruments
Yiwei Liu, Jiamou Liu, Kaibin Wan, Zhan Qin, Zijian Zhang, Bakhadyr Khoussainov, Liehuang Zhu
Spotlight
Tue 18:30 Learning and Planning in Average-Reward Markov Decision Processes
Yi Wan, Abhishek Naik, Richard Sutton
Spotlight
Tue 18:30 Learning While Playing in Mean-Field Games: Convergence and Optimality
Qiaomin Xie, Zhuoran Yang, Zhaoran Wang, Andreea Minca
Spotlight
Tue 18:30 GMAC: A Distributional Perspective on Actor-Critic Framework
Daniel Nam, Younghoon Kim, Chan Park
Spotlight
Tue 18:35 Goal-Conditioned Reinforcement Learning with Imagined Subgoals
Elliot Chane-Sane, Cordelia Schmid, Ivan Laptev
Spotlight
Tue 18:35 Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng
Spotlight
Tue 18:35 Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing
Kaixin Wang, Kuangqi Zhou, Qixin Zhang, Jie Shao, Bryan Hooi, Jiashi Feng
Spotlight
Tue 18:40 Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision
Johan Björck, Xiangyu Chen, Christopher De Sa, Carla Gomes, Kilian Weinberger
Spotlight
Tue 18:40 Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions
Zixin Zhong, Wang Chi Cheung, Vincent Tan
Spotlight
Tue 18:40 Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment
Philip Ball, Cong Lu, Jack Parker-Holder, Stephen Roberts
Spotlight
Tue 18:45 Emphatic Algorithms for Deep Reinforcement Learning
Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt
Spotlight
Tue 18:45 Reinforcement Learning of Implicit and Explicit Control Flow Instructions
Ethan Brooks, Janarthanan Rajendran, Richard Lewis, Satinder Singh
Oral
Tue 19:00 Inverse Decision Modeling: Learning Interpretable Representations of Behavior
Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar
Oral
Tue 19:00 Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
Iou-Jen Liu, Unnat Jain, Raymond Yeh, Alex Schwing
Oral
Tue 19:00 Hyperparameter Selection for Imitation Learning
Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Sabela Ramos, Nikola Momchev, Sertan Girgin, Raphael Marinier, Lukasz Stafiniak, Emmanuel Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin
Spotlight
Tue 19:20 On Proximal Policy Optimization's Heavy-tailed Gradients
Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, Zico Kolter, Zachary Lipton, Sivaraman Balakrishnan, Russ Salakhutdinov, Pradeep Ravikumar
Spotlight
Tue 19:20 A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
Scott Fujimoto, David Meger, Doina Precup
Spotlight
Tue 19:20 Revisiting Peng's Q($\lambda$) for Modern Reinforcement Learning
Tadashi Kozuno, Yunhao Tang, Mark Rowland, Remi Munos, Steven Kapturowski, Will Dabney, Michal Valko, Dave Abel
Spotlight
Tue 19:25 Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
Hiroki Furuta, Tatsuya Matsushima, Tadashi Kozuno, Yutaka Matsuo, Sergey Levine, Ofir Nachum, Shixiang Gu
Spotlight
Tue 19:25 Learning to Weight Imperfect Demonstrations
Yunke Wang, Chang Xu, Bo Du, Honglak Lee
Spotlight
Tue 19:25 Monotonic Robust Policy Optimization with Model Discrepancy
Yuankun Jiang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong
Spotlight
Tue 19:30 Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning
Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Gu
Spotlight
Tue 19:30 DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning
Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu
Spotlight
Tue 19:30 Taylor Expansion of Discount Factors
Yunhao Tang, Mark Rowland, Remi Munos, Michal Valko
Spotlight
Tue 19:35 MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning
Kevin Li, Abhishek Gupta, Ashwin D Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine
Spotlight
Tue 19:35 Is Pessimism Provably Efficient for Offline RL?
Ying Jin, Zhuoran Yang, Zhaoran Wang
Spotlight
Tue 19:35 Generalizable Episodic Memory for Deep Reinforcement Learning
Hao Hu, Jianing Ye, Guangxiang Zhu, Zhizhou Ren, Chongjie Zhang
Spotlight
Tue 19:40 Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization
Wes Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux
Spotlight
Tue 19:40 RRL: Resnet as representation for Reinforcement Learning
Rutav Shah, Vikash Kumar
Spotlight
Tue 19:40 Representation Matters: Offline Pretraining for Sequential Decision Making
Mengjiao Yang, Ofir Nachum
Spotlight
Tue 19:45 SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II
Xiangjun Wang, Junxiao Song, Penghui Qi, Peng Peng, Zhenkun Tang, Wei Zhang, Weimin Li, Xiongjun Pi, Jujie He, Chao Gao, Haitao Long, Quan Yuan
Spotlight
Tue 19:45 Density Constrained Reinforcement Learning
Zengyi Qin, Yuxiao Chen, Chuchu Fan
Poster
Tue 21:00 Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment
Philip Ball, Cong Lu, Jack Parker-Holder, Stephen Roberts
Poster
Tue 21:00 Learning While Playing in Mean-Field Games: Convergence and Optimality
Qiaomin Xie, Zhuoran Yang, Zhaoran Wang, Andreea Minca
Poster
Tue 21:00 Robust Asymmetric Learning in POMDPs
Andrew Warrington, Jonathan Lavington, Adam Scibior, Mark Schmidt, Frank Wood
Poster
Tue 21:00 On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Yiming Zhang, Keith Ross
Poster
Tue 21:00 High Confidence Generalization for Reinforcement Learning
James Kostas, Yash Chandak, Scott Jordan, Georgios Theocharous, Philip Thomas
Poster
Tue 21:00 Trajectory Diversity for Zero-Shot Coordination
Andrei Lupu, Brandon Cui, Hengyuan Hu, Jakob Foerster
Poster
Tue 21:00 Hyperparameter Selection for Imitation Learning
Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Sabela Ramos, Nikola Momchev, Sertan Girgin, Raphael Marinier, Lukasz Stafiniak, Emmanuel Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin
Poster
Tue 21:00 Monotonic Robust Policy Optimization with Model Discrepancy
Yuankun Jiang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong
Poster
Tue 21:00 Density Constrained Reinforcement Learning
Zengyi Qin, Yuxiao Chen, Chuchu Fan
Poster
Tue 21:00 Offline Reinforcement Learning with Fisher Divergence Critic Regularization
Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum
Poster
Tue 21:00 SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
Jim Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar
Poster
Tue 21:00 OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee, Wonseok Jeon, Byung-Jun Lee, Joelle Pineau, Kee-Eung Kim
Poster
Tue 21:00 PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
Kimin Lee, Laura Smith, Pieter Abbeel
Poster
Tue 21:00 GMAC: A Distributional Perspective on Actor-Critic Framework
Daniel Nam, Younghoon Kim, Chan Park
Poster
Tue 21:00 RRL: Resnet as representation for Reinforcement Learning
Rutav Shah, Vikash Kumar
Poster
Tue 21:00 Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision
Johan Björck, Xiangyu Chen, Christopher De Sa, Carla Gomes, Kilian Weinberger
Poster
Tue 21:00 Goal-Conditioned Reinforcement Learning with Imagined Subgoals
Elliot Chane-Sane, Cordelia Schmid, Ivan Laptev
Poster
Tue 21:00 Prioritized Level Replay
Minqi Jiang, Edward Grefenstette, Tim Rocktäschel
Poster
Tue 21:00 Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing
Kaixin Wang, Kuangqi Zhou, Qixin Zhang, Jie Shao, Bryan Hooi, Jiashi Feng
Poster
Tue 21:00 On Proximal Policy Optimization's Heavy-tailed Gradients
Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, Zico Kolter, Zachary Lipton, Sivaraman Balakrishnan, Russ Salakhutdinov, Pradeep Ravikumar
Poster
Tue 21:00 PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar
Poster
Tue 21:00 Reinforcement Learning of Implicit and Explicit Control Flow Instructions
Ethan Brooks, Janarthanan Rajendran, Richard Lewis, Satinder Singh
Poster
Tue 21:00 Convex Regularization in Monte-Carlo Tree Search
Tuan Q Dam, Carlo D'Eramo, Jan Peters, Joni Pajarinen
Poster
Tue 21:00 SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II
Xiangjun Wang, Junxiao Song, Penghui Qi, Peng Peng, Zhenkun Tang, Wei Zhang, Weimin Li, Xiongjun Pi, Jujie He, Chao Gao, Haitao Long, Quan Yuan
Poster
Tue 21:00 FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning
Tianhao Zhang, 岳珩 李, Chen Wang, Guangming Xie, Zongqing Lu
Poster
Tue 21:00 Learning and Planning in Average-Reward Markov Decision Processes
Yi Wan, Abhishek Naik, Richard Sutton
Poster
Tue 21:00 Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
Susan Amin, Maziar Gomrokchi, Hossein Aboutalebi, Harsh Satija, Doina Precup
Poster
Tue 21:00 Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions
Zixin Zhong, Wang Chi Cheung, Vincent Tan
Poster
Tue 21:00 Keyframe-Focused Visual Imitation Learning
Chuan Wen, Jierui Lin, Jianing Qian, Yang Gao, Dinesh Jayaraman
Poster
Tue 21:00 Multi-Task Reinforcement Learning with Context-based Representations
Shagun Sodhani, Amy Zhang, Joelle Pineau
Poster
Tue 21:00 Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng
Poster
Tue 21:00 Taylor Expansion of Discount Factors
Yunhao Tang, Mark Rowland, Remi Munos, Michal Valko
Poster
Tue 21:00 From Local to Global Norm Emergence: Dissolving Self-reinforcing Substructures with Incremental Social Instruments
Yiwei Liu, Jiamou Liu, Kaibin Wan, Zhan Qin, Zijian Zhang, Bakhadyr Khoussainov, Liehuang Zhu
Poster
Tue 21:00 A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
Scott Fujimoto, David Meger, Doina Precup
Poster
Tue 21:00 Differentiable Spatial Planning using Transformers
Devendra Singh Chaplot, Deepak Pathak, Jitendra Malik
Poster
Tue 21:00 Discovering symbolic policies with deep reinforcement learning
Mikel Landajuela Larma, Brenden Petersen, Sookyung Kim, Claudio Santiago, Ruben Glatt, Nathan Mundhenk, Jacob Pettit, Daniel Faissol
Poster
Tue 21:00 Emergent Social Learning via Multi-agent Reinforcement Learning
Kamal Ndousse, Douglas Eck, Sergey Levine, Natasha Jaques
Poster
Tue 21:00 Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
Iou-Jen Liu, Unnat Jain, Raymond Yeh, Alex Schwing
Poster
Tue 21:00 Learning to Weight Imperfect Demonstrations
Yunke Wang, Chang Xu, Bo Du, Honglak Lee
Poster
Tue 21:00 Revisiting Peng's Q($\lambda$) for Modern Reinforcement Learning
Tadashi Kozuno, Yunhao Tang, Mark Rowland, Remi Munos, Steven Kapturowski, Will Dabney, Michal Valko, Dave Abel
Poster
Tue 21:00 MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning
Kevin Li, Abhishek Gupta, Ashwin D Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine
Poster
Tue 21:00 Emphatic Algorithms for Deep Reinforcement Learning
Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt
Poster
Tue 21:00 From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls
Poster
Tue 21:00 Inverse Decision Modeling: Learning Interpretable Representations of Behavior
Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar
Poster
Tue 21:00 Generalizable Episodic Memory for Deep Reinforcement Learning
Hao Hu, Jianing Ye, Guangxiang Zhu, Zhizhou Ren, Chongjie Zhang
Poster
Tue 21:00 DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning
Wei-Fang Sun, Cheng-Kuang Lee, Chun-Yi Lee
Poster
Tue 21:00 The Emergence of Individuality
Jiechuan Jiang, Zongqing Lu
Poster
Tue 21:00 DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning
Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu
Poster
Tue 21:00 Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
Yue Wu, Shuangfei Zhai, Nitish Srivastava, Josh M Susskind, Jian Zhang, Russ Salakhutdinov, Hanlin Goh
Poster
Tue 21:00 Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks
Sungryull Sohn, Sungtae Lee, Jongwook Choi, Harm van Seijen, Mehdi Fatemi, Honglak Lee
Poster
Tue 21:00 Safe Reinforcement Learning with Linear Function Approximation
Sanae Amani, Christos Thrampoulidis, Lin Yang
Poster
Tue 21:00 Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization
Wes Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux
Poster
Tue 21:00 Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning
Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Gu
Poster
Tue 21:00 Is Pessimism Provably Efficient for Offline RL?
Ying Jin, Zhuoran Yang, Zhaoran Wang
Poster
Tue 21:00 Representation Matters: Offline Pretraining for Sequential Decision Making
Mengjiao Yang, Ofir Nachum
Poster
Tue 21:00 Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
Hiroki Furuta, Tatsuya Matsushima, Tadashi Kozuno, Yutaka Matsuo, Sergey Levine, Ofir Nachum, Shixiang Gu
Poster
Tue 21:00 Recomposing the Reinforcement Learning Building Blocks with Hypernetworks
Elad Sarafian, Shai Keynan, Sarit Kraus
Poster
Tue 21:00 Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
Shariq Iqbal, Christian Schroeder, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha
Oral
Wed 5:00 Cross-domain Imitation from Observations
Dripta S. Raychaudhuri, Sujoy Paul, Jeroen Vanbaar, Amit Roy-Chowdhury
Oral
Wed 5:00 APS: Active Pretraining with Successor Features
Hao Liu, Pieter Abbeel
Spotlight
Wed 5:20 SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel
Spotlight
Wed 5:20 Guided Exploration with Proximal Policy Optimization using a Single Demonstration
Gabriele Libardi, Gianni De Fabritiis, Sebastian Dittert
Spotlight
Wed 5:25 Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices
Evan Liu, Aditi Raghunathan, Percy Liang, Chelsea Finn
Spotlight
Wed 5:25 Self-Paced Context Evaluation for Contextual Reinforcement Learning
Theresa Eimer, André Biedenkapp, Frank Hutter, Marius Lindauer
Spotlight
Wed 5:30 Active Feature Acquisition with Generative Surrogate Models
Yang Li, Junier Oliva
Spotlight
Wed 5:30 Unsupervised Skill Discovery with Bottleneck Option Learning
Jaekyeom Kim, Seohong Park, Gunhee Kim
Spotlight
Wed 5:30 Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
David Bruns-Smith
Spotlight
Wed 5:35 Characterizing the Gap Between Actor-Critic and Policy Gradient
Junfeng Wen, Saurabh Kumar, Ramki Gummadi, Dale Schuurmans
Spotlight
Wed 5:40 Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning
Austin W. Hanjie, Victor Zhong, Karthik Narasimhan
Spotlight
Wed 5:40 Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective
Florin Gogianu, Tudor Berariu, Mihaela Rosca, Claudia Clopath, Lucian Busoniu, Razvan Pascanu
Spotlight
Wed 5:45 Data-efficient Hindsight Off-policy Option Learning
Markus Wulfmeier, Dushyant Rao, Roland Hafner, Thomas Lampe, Abbas Abdolmaleki, Tim Hertweck, Michael Neunert, Dhruva Tirumala Bukkapatnam, Noah Siegel, Nicolas Heess, Martin Riedmiller
Spotlight
Wed 5:45 Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies
Jimmy Yang, Justinian Rosca, Karthik Narasimhan, Peter Ramadge
Oral
Wed 6:00 Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang
Oral
Wed 6:00 Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
Ying Fan, Yifei Ming
Spotlight
Wed 6:20 Principled Exploration via Optimistic Bootstrapping and Backward Induction
Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang
Spotlight
Wed 6:20 Megaverse: Simulating Embodied Agents at One Million Experiences per Second
Aleksei Petrenko, Erik Wijmans, Brennan Shacklett, Vladlen Koltun
Spotlight
Wed 6:25 Ensemble Bootstrapping for Q-Learning
Oren Peer, Chen Tessler, Nadav Merlis, Ron Meir
Spotlight
Wed 6:25 Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing
Filippos Christianos, Georgios Papoudakis, Arrasy Rahman, Stefano V. Albrecht
Spotlight
Wed 6:30 Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning
Arrasy Rahman, Niklas Hopner, Filippos Christianos, Stefano V. Albrecht
Spotlight
Wed 6:30 Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
Sajad Khodadadian, Zaiwei Chen, Siva Maguluri
Spotlight
Wed 6:35 Off-Belief Learning
Hengyuan Hu, Adam Lerer, Brandon Cui, Luis Pineda, Noam Brown, Jakob Foerster
Spotlight
Wed 6:35 A Regret Minimization Approach to Iterative Learning Control
Naman Agarwal, Elad Hazan, Anirudha Majumdar, Karan Singh
Spotlight
Wed 6:40 TempoRL: Learning When to Act
André Biedenkapp, Raghu Rajan, Frank Hutter, Marius Lindauer
Spotlight
Wed 6:45 Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning
Sebastian Curi, Ilija Bogunovic, Andreas Krause
Spotlight
Wed 6:45 State Relevance for Off-Policy Evaluation
Simon Shen, Jason Ma, Omer Gottesman, Finale Doshi-Velez
Oral
Wed 7:00 High-dimensional Experimental Design and Kernel Bandits
Romain Camilleri, Kevin Jamieson, Julian Katz-Samuels
Spotlight
Wed 7:00 Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang, Yifan Wu, Russ Salakhutdinov, Sham Kakade
Oral
Wed 7:00 The Logical Options Framework
Brandon Araki, Xiao Li, Kiran Vodrahalli, Jonathan DeCastro, Micah Fry, Daniela Rus
Spotlight
Wed 7:05 Path Planning using Neural A* Search
Ryo Yonetani, Tatsunori Taniai, Mohammadamin Barekatain, Mai Nishimura, Asako Kanezaki
Spotlight
Wed 7:15 Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning
Gen Li, Changxiao Cai, Yuxin Chen, Yuantao Gu, Yuting Wei, Yuejie Chi
Spotlight
Wed 7:20 Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning
Henry Charlesworth, Giovanni Montana
Spotlight
Wed 7:20 Dichotomous Optimistic Search to Quantify Human Perception
Julien Audiffren
Oral
Wed 7:20 On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
Shuang Qiu, Jieping Ye, Zhaoran Wang, Zhuoran Yang
Spotlight
Wed 7:25 Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits
Kwang-Sung Jun, Lalit Jain, Blake Mason, Houssam Nassif
Spotlight
Wed 7:25 Continuous-time Model-based Reinforcement Learning
Cagatay Yildiz, Markus Heinonen, Harri Lähdesmäki
Spotlight
Wed 7:30 Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions
Tal Lancewicki, Shahar Segal, Tomer Koren, Yishay Mansour
Spotlight
Wed 7:35 Best Model Identification: A Rested Bandit Formulation
Leonardo Cella, Massimiliano Pontil, Claudio Gentile
Spotlight
Wed 7:35 Deciding What to Learn: A Rate-Distortion Approach
Dilip Arumugam, Benjamin Van Roy
Spotlight
Wed 7:40 No-regret Algorithms for Capturing Events in Poisson Point Processes
Mojmir Mutny, Andreas Krause
Spotlight
Wed 7:40 Adversarial Option-Aware Hierarchical Imitation Learning
Mingxuan Jing, Wenbing Huang, Fuchun Sun, Xiaojian Ma, Tao Kong, Chuang Gan, Lei Li
Spotlight
Wed 7:45 Parametric Graph for Unimodal Ranking Bandit
CamilleS Gauthier, Romaric Gaudel, Elisa Fromont, Boammani Aser Lompo
Spotlight
Wed 7:45 Value Iteration in Continuous Actions, States and Time
Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg
Poster
Wed 9:00 TempoRL: Learning When to Act
André Biedenkapp, Raghu Rajan, Frank Hutter, Marius Lindauer
Poster
Wed 9:00 High-dimensional Experimental Design and Kernel Bandits
Romain Camilleri, Kevin Jamieson, Julian Katz-Samuels
Poster
Wed 9:00 Off-Belief Learning
Hengyuan Hu, Adam Lerer, Brandon Cui, Luis Pineda, Noam Brown, Jakob Foerster
Poster
Wed 9:00 A Regret Minimization Approach to Iterative Learning Control
Naman Agarwal, Elad Hazan, Anirudha Majumdar, Karan Singh
Poster
Wed 9:00 Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective
Florin Gogianu, Tudor Berariu, Mihaela Rosca, Claudia Clopath, Lucian Busoniu, Razvan Pascanu
Poster
Wed 9:00 Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning
Sebastian Curi, Ilija Bogunovic, Andreas Krause
Poster
Wed 9:00 Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning
Gen Li, Changxiao Cai, Yuxin Chen, Yuantao Gu, Yuting Wei, Yuejie Chi
Poster
Wed 9:00 Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits
Kwang-Sung Jun, Lalit Jain, Blake Mason, Houssam Nassif
Poster
Wed 9:00 Characterizing the Gap Between Actor-Critic and Policy Gradient
Junfeng Wen, Saurabh Kumar, Ramki Gummadi, Dale Schuurmans
Poster
Wed 9:00 Self-Paced Context Evaluation for Contextual Reinforcement Learning
Theresa Eimer, André Biedenkapp, Frank Hutter, Marius Lindauer
Poster
Wed 9:00 Principled Exploration via Optimistic Bootstrapping and Backward Induction
Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang
Poster
Wed 9:00 Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning
Austin W. Hanjie, Victor Zhong, Karthik Narasimhan
Poster
Wed 9:00 Parametric Graph for Unimodal Ranking Bandit
CamilleS Gauthier, Romaric Gaudel, Elisa Fromont, Boammani Aser Lompo
Poster
Wed 9:00 No-regret Algorithms for Capturing Events in Poisson Point Processes
Mojmir Mutny, Andreas Krause
Poster
Wed 9:00 On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
Shuang Qiu, Jieping Ye, Zhaoran Wang, Zhuoran Yang
Poster
Wed 9:00 Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang
Poster
Wed 9:00 Cross-domain Imitation from Observations
Dripta S. Raychaudhuri, Sujoy Paul, Jeroen Vanbaar, Amit Roy-Chowdhury
Poster
Wed 9:00 Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
David Bruns-Smith
Poster
Wed 9:00 Megaverse: Simulating Embodied Agents at One Million Experiences per Second
Aleksei Petrenko, Erik Wijmans, Brennan Shacklett, Vladlen Koltun
Poster
Wed 9:00 Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices
Evan Liu, Aditi Raghunathan, Percy Liang, Chelsea Finn
Poster
Wed 9:00 Guided Exploration with Proximal Policy Optimization using a Single Demonstration
Gabriele Libardi, Gianni De Fabritiis, Sebastian Dittert
Poster
Wed 9:00 State Relevance for Off-Policy Evaluation
Simon Shen, Jason Ma, Omer Gottesman, Finale Doshi-Velez
Poster
Wed 9:00 Dichotomous Optimistic Search to Quantify Human Perception
Julien Audiffren
Poster
Wed 9:00 Adversarial Option-Aware Hierarchical Imitation Learning
Mingxuan Jing, Wenbing Huang, Fuchun Sun, Xiaojian Ma, Tao Kong, Chuang Gan, Lei Li
Poster
Wed 9:00 The Logical Options Framework
Brandon Araki, Xiao Li, Kiran Vodrahalli, Jonathan DeCastro, Micah Fry, Daniela Rus
Poster
Wed 9:00 Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing
Filippos Christianos, Georgios Papoudakis, Arrasy Rahman, Stefano V. Albrecht
Poster
Wed 9:00 Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
Ying Fan, Yifei Ming
Poster
Wed 9:00 Value Iteration in Continuous Actions, States and Time
Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg
Poster
Wed 9:00 Continuous-time Model-based Reinforcement Learning
Cagatay Yildiz, Markus Heinonen, Harri Lähdesmäki
Poster
Wed 9:00 Unsupervised Skill Discovery with Bottleneck Option Learning
Jaekyeom Kim, Seohong Park, Gunhee Kim
Poster
Wed 9:00 Deciding What to Learn: A Rate-Distortion Approach
Dilip Arumugam, Benjamin Van Roy
Poster
Wed 9:00 Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning
Henry Charlesworth, Giovanni Montana
Poster
Wed 9:00 Path Planning using Neural A* Search
Ryo Yonetani, Tatsunori Taniai, Mohammadamin Barekatain, Mai Nishimura, Asako Kanezaki
Poster
Wed 9:00 Active Feature Acquisition with Generative Surrogate Models
Yang Li, Junier Oliva
Poster
Wed 9:00 Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
Sajad Khodadadian, Zaiwei Chen, Siva Maguluri
Poster
Wed 9:00 SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel
Poster
Wed 9:00 Ensemble Bootstrapping for Q-Learning
Oren Peer, Chen Tessler, Nadav Merlis, Ron Meir
Poster
Wed 9:00 Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions
Tal Lancewicki, Shahar Segal, Tomer Koren, Yishay Mansour
Poster
Wed 9:00 Best Model Identification: A Rested Bandit Formulation
Leonardo Cella, Massimiliano Pontil, Claudio Gentile
Poster
Wed 9:00 APS: Active Pretraining with Successor Features
Hao Liu, Pieter Abbeel
Poster
Wed 9:00 Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning
Arrasy Rahman, Niklas Hopner, Filippos Christianos, Stefano V. Albrecht
Poster
Wed 9:00 Data-efficient Hindsight Off-policy Option Learning
Markus Wulfmeier, Dushyant Rao, Roland Hafner, Thomas Lampe, Abbas Abdolmaleki, Tim Hertweck, Michael Neunert, Dhruva Tirumala Bukkapatnam, Noah Siegel, Nicolas Heess, Martin Riedmiller
Poster
Wed 9:00 Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies
Jimmy Yang, Justinian Rosca, Karthik Narasimhan, Peter Ramadge
Poster
Wed 9:00 Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang, Yifan Wu, Russ Salakhutdinov, Sham Kakade
Oral
Wed 17:00 The Symmetry between Arms and Knapsacks: A Primal-Dual Approach for Bandits with Knapsacks
Xiaocheng Li, Chunlin Sun, Yinyu Ye
Spotlight
Wed 17:20 Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
Dongruo Zhou, Jiafan He, Quanquan Gu
Spotlight
Wed 17:20 Dynamic Planning and Learning under Recovering Rewards
David Simchi-Levi, Zeyu Zheng, Feng Zhu
Spotlight
Wed 17:25 Best Arm Identification in Graphical Bilinear Bandits
Geovani Rizk, Albert Thomas, Igor Colin, Rida Laraki, Yann Chevaleyre
Spotlight
Wed 17:30 Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang
Spotlight
Wed 17:35 Incentivized Bandit Learning with Self-Reinforcing User Preferences
Tianchen Zhou, Jia Liu, Chaosheng Dong, Jingyuan Deng
Spotlight
Wed 17:40 Approximation Theory Based Methods for RKHS Bandits
Sho Takemori, Masahiro Sato
Spotlight
Wed 17:45 Dynamic Balancing for Model Selection in Bandits and RL
Ashok Cutkosky, Christoph Dann, Abhimanyu Das, Claudio Gentile, Aldo Pacchiano, Manish Purohit
Oral
Wed 18:00 Cyclically Equivariant Neural Decoders for Cyclic Codes
Xiangyu Chen, Min Ye
Oral
Wed 18:00 Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism
Brijen Thananjeyan, Kirthevasan Kandasamy, Ion Stoica, Michael Jordan, Ken Goldberg, Joseph E Gonzalez
Spotlight
Wed 18:20 Optimal Streaming Algorithms for Multi-Armed Bandits
Tianyuan Jin, Keke Huang, Jing Tang, Xiaokui Xiao
Spotlight
Wed 18:25 Top-k eXtreme Contextual Bandits with Arm Hierarchy
Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon
Spotlight
Wed 18:30 Improved Regret Bounds of Bilinear Bandits using Action Space Analysis
Kyoungseok Jang, Kwang-Sung Jun, Se-Young Yun, Wanmo Kang
Spotlight
Wed 18:35 Interaction-Grounded Learning
Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad
Spotlight
Wed 18:35 Deep Coherent Exploration for Continuous Control
Yijie Zhang, Herke van Hoof
Spotlight
Wed 18:40 Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits
Tianyuan Jin, Jing Tang, Pan Xu, Keke Huang, Xiaokui Xiao, Quanquan Gu
Spotlight
Wed 18:45 Pure Exploration and Regret Minimization in Matching Bandits
Flore Sentenac, Jialin Yi, Clément Calauzènes, Vianney Perchet, Milan Vojnovic
Oral
Wed 19:00 Multi-layered Network Exploration via Random Walks: From Offline Optimization to Online Learning
Xutong Liu, Jinhang Zuo, Xiaowei Chen, Wei Chen, John C. S. Lui
Spotlight
Wed 19:20 Combinatorial Blocking Bandits with Stochastic Delays
Alexia Atsidakou, Orestis Papadigenopoulos, Soumya Basu, Constantine Caramanis, Sanjay Shakkottai
Spotlight
Wed 19:20 A Differentiable Point Process with Its Application to Spiking Neural Networks
Hiroshi Kajino
Spotlight
Wed 19:25 Sparsity-Agnostic Lasso Bandit
Min-hwan Oh, Garud Iyengar, Assaf Zeevi
Spotlight
Wed 19:25 AdaXpert: Adapting Neural Architecture for Growing Data
Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan
Spotlight
Wed 19:30 Quantile Bandits for Best Arms Identification
Mengyan Zhang, Cheng Soon Ong
Spotlight
Wed 19:35 Beyond $\log^2(T)$ regret for decentralized bandits in matching markets
Soumya Basu, Karthik Abinav Sankararaman, Abishek Sankararaman
Spotlight
Wed 19:40 Robust Pure Exploration in Linear Bandits with Limited Budget
Ayya Alieva, Ashok Cutkosky, Abhimanyu Das
Spotlight
Wed 19:40 Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling
Kuruge Darshana Abeyrathna, Bimal Bhattarai, Morten Goodwin, Saeed Rahimi Gorji, Ole-Christoffer Granmo, Lei Jiao, Rupsa Saha, Rohan Kumar Yadav
Spotlight
Wed 19:45 Adapting to misspecification in contextual bandits with offline regression oracles
Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey
Poster
Wed 21:00 Deep Coherent Exploration for Continuous Control
Yijie Zhang, Herke van Hoof
Poster
Wed 21:00 Sparsity-Agnostic Lasso Bandit
Min-hwan Oh, Garud Iyengar, Assaf Zeevi
Poster
Wed 21:00 Improved Regret Bounds of Bilinear Bandits using Action Space Analysis
Kyoungseok Jang, Kwang-Sung Jun, Se-Young Yun, Wanmo Kang
Poster
Wed 21:00 Multi-layered Network Exploration via Random Walks: From Offline Optimization to Online Learning
Xutong Liu, Jinhang Zuo, Xiaowei Chen, Wei Chen, John C. S. Lui
Poster
Wed 21:00 Dynamic Planning and Learning under Recovering Rewards
David Simchi-Levi, Zeyu Zheng, Feng Zhu
Poster
Wed 21:00 Pure Exploration and Regret Minimization in Matching Bandits
Flore Sentenac, Jialin Yi, Clément Calauzènes, Vianney Perchet, Milan Vojnovic
Poster
Wed 21:00 Robust Pure Exploration in Linear Bandits with Limited Budget
Ayya Alieva, Ashok Cutkosky, Abhimanyu Das
Poster
Wed 21:00 Dynamic Balancing for Model Selection in Bandits and RL
Ashok Cutkosky, Christoph Dann, Abhimanyu Das, Claudio Gentile, Aldo Pacchiano, Manish Purohit
Poster
Wed 21:00 Approximation Theory Based Methods for RKHS Bandits
Sho Takemori, Masahiro Sato
Poster
Wed 21:00 Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
Dongruo Zhou, Jiafan He, Quanquan Gu
Poster
Wed 21:00 Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits
Tianyuan Jin, Jing Tang, Pan Xu, Keke Huang, Xiaokui Xiao, Quanquan Gu
Poster
Wed 21:00 Combinatorial Blocking Bandits with Stochastic Delays
Alexia Atsidakou, Orestis Papadigenopoulos, Soumya Basu, Constantine Caramanis, Sanjay Shakkottai
Poster
Wed 21:00 Interaction-Grounded Learning
Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad
Poster
Wed 21:00 Optimal Streaming Algorithms for Multi-Armed Bandits
Tianyuan Jin, Keke Huang, Jing Tang, Xiaokui Xiao
Poster
Wed 21:00 The Symmetry between Arms and Knapsacks: A Primal-Dual Approach for Bandits with Knapsacks
Xiaocheng Li, Chunlin Sun, Yinyu Ye
Poster
Wed 21:00 Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang
Poster
Wed 21:00 Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism
Brijen Thananjeyan, Kirthevasan Kandasamy, Ion Stoica, Michael Jordan, Ken Goldberg, Joseph E Gonzalez
Poster
Wed 21:00 A Differentiable Point Process with Its Application to Spiking Neural Networks
Hiroshi Kajino
Poster
Wed 21:00 Beyond $\log^2(T)$ regret for decentralized bandits in matching markets
Soumya Basu, Karthik Abinav Sankararaman, Abishek Sankararaman
Poster
Wed 21:00 Incentivized Bandit Learning with Self-Reinforcing User Preferences
Tianchen Zhou, Jia Liu, Chaosheng Dong, Jingyuan Deng
Poster
Wed 21:00 Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling
Kuruge Darshana Abeyrathna, Bimal Bhattarai, Morten Goodwin, Saeed Rahimi Gorji, Ole-Christoffer Granmo, Lei Jiao, Rupsa Saha, Rohan Kumar Yadav
Poster
Wed 21:00 Quantile Bandits for Best Arms Identification
Mengyan Zhang, Cheng Soon Ong
Poster
Wed 21:00 Adapting to misspecification in contextual bandits with offline regression oracles
Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey
Poster
Wed 21:00 Top-k eXtreme Contextual Bandits with Arm Hierarchy
Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon
Poster
Wed 21:00 Cyclically Equivariant Neural Decoders for Cyclic Codes
Xiangyu Chen, Min Ye
Poster
Wed 21:00 Best Arm Identification in Graphical Bilinear Bandits
Geovani Rizk, Albert Thomas, Igor Colin, Rida Laraki, Yann Chevaleyre
Poster
Wed 21:00 AdaXpert: Adapting Neural Architecture for Growing Data
Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan
Spotlight
Thu 5:45 Detecting Rewards Deterioration in Episodic Reinforcement Learning
Ido Greenberg, Shie Mannor
Spotlight
Thu 5:45 Directional Bias Amplification
Angelina Wang, Olga Russakovsky
Oral
Thu 6:00 Deep Adaptive Design: Amortizing Sequential Bayesian Experimental Design
Adam Foster, Desi Ivanova, Ilyas Malik, Tom Rainforth
Spotlight
Thu 6:25 Off-Policy Confidence Sequences
Nikos Karampatziakis, Paul Mineiro, Aaditya Ramdas
Spotlight
Thu 6:40 Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods
Chris Nota, Philip Thomas, Bruno C. da Silva
Spotlight
Thu 7:20 Unified Robust Semi-Supervised Variational Autoencoder
Xu Chen
Spotlight
Thu 7:35 Online Limited Memory Neural-Linear Bandits with Likelihood Matching
Ofir Nabati, Tom Zahavy, Shie Mannor
Poster
Thu 9:00 Deep Adaptive Design: Amortizing Sequential Bayesian Experimental Design
Adam Foster, Desi Ivanova, Ilyas Malik, Tom Rainforth
Poster
Thu 9:00 Off-Policy Confidence Sequences
Nikos Karampatziakis, Paul Mineiro, Aaditya Ramdas
Poster
Thu 9:00 Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods
Chris Nota, Philip Thomas, Bruno C. da Silva
Poster
Thu 9:00 Unified Robust Semi-Supervised Variational Autoencoder
Xu Chen
Poster
Thu 9:00 Directional Bias Amplification
Angelina Wang, Olga Russakovsky
Poster
Thu 9:00 Adapting to Delays and Data in Adversarial Multi-Armed Bandits
András György, Pooria Joulani
Poster
Thu 9:00 Online Limited Memory Neural-Linear Bandits with Likelihood Matching
Ofir Nabati, Tom Zahavy, Shie Mannor
Poster
Thu 9:00 Detecting Rewards Deterioration in Episodic Reinforcement Learning
Ido Greenberg, Shie Mannor
Spotlight
Thu 17:45 Structured World Belief for Reinforcement Learning in POMDP
Gautam Singh, Skand Peri, Junghyun Kim, Hyunseok Kim, Sungjin Ahn
Spotlight
Thu 18:25 Policy Caches with Successor Features
Mark Nemecek, Ron Parr
Spotlight
Thu 18:30 Meta-Thompson Sampling
Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvari
Spotlight
Thu 18:35 Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games
Hongyi Guo, Zuyue Fu, Zhuoran Yang, Zhaoran Wang
Oral
Thu 19:00 Differentially Private Sliced Wasserstein Distance
Alain Rakotomamonjy, Liva Ralaivola
Spotlight
Thu 19:20 Near-Optimal Entrywise Anomaly Detection for Low-Rank Matrices with Sub-Exponential Noise
Vivek Farias, Andrew Li, Tianyi Peng
Spotlight
Thu 20:30 On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization
Xu Cai, Jonathan Scarlett
Spotlight
Thu 20:35 Optimal Thompson Sampling strategies for support-aware CVaR bandits
Dorian Baudry, Romain Gautron, Emilie Kaufmann, Odalric-Ambrym Maillard
Spotlight
Thu 20:40 On Limited-Memory Subsampling Strategies for Bandits
Dorian Baudry, Yoan Russac, Olivier Cappé
Spotlight
Thu 20:45 Problem Dependent View on Structured Thresholding Bandit Problems
James Cheshire, Pierre Menard, Alexandra Carpentier
Spotlight
Thu 20:45 CURI: A Benchmark for Productive Concept Learning Under Uncertainty
Rama Vedantam, Arthur Szlam, Max Nickel, Ari Morcos, Brenden Lake
Spotlight
Thu 20:50 Leveraging Good Representations in Linear Contextual Bandits
Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta
Poster
Thu 21:00 Meta-Thompson Sampling
Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvari
Poster
Thu 21:00 On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization
Xu Cai, Jonathan Scarlett
Poster
Thu 21:00 Policy Caches with Successor Features
Mark Nemecek, Ron Parr
Poster
Thu 21:00 Leveraging Good Representations in Linear Contextual Bandits
Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta
Poster
Thu 21:00 CURI: A Benchmark for Productive Concept Learning Under Uncertainty
Rama Vedantam, Arthur Szlam, Max Nickel, Ari Morcos, Brenden Lake
Poster
Thu 21:00 Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games
Hongyi Guo, Zuyue Fu, Zhuoran Yang, Zhaoran Wang
Poster
Thu 21:00 Problem Dependent View on Structured Thresholding Bandit Problems
James Cheshire, Pierre Menard, Alexandra Carpentier
Poster
Thu 21:00 Structured World Belief for Reinforcement Learning in POMDP
Gautam Singh, Skand Peri, Junghyun Kim, Hyunseok Kim, Sungjin Ahn
Poster
Thu 21:00 Differentially Private Sliced Wasserstein Distance
Alain Rakotomamonjy, Liva Ralaivola
Poster
Thu 21:00 On Limited-Memory Subsampling Strategies for Bandits
Dorian Baudry, Yoan Russac, Olivier Cappé
Poster
Thu 21:00 Near-Optimal Entrywise Anomaly Detection for Low-Rank Matrices with Sub-Exponential Noise
Vivek Farias, Andrew Li, Tianyi Peng
Poster
Thu 21:00 Optimal Thompson Sampling strategies for support-aware CVaR bandits
Dorian Baudry, Romain Gautron, Emilie Kaufmann, Odalric-Ambrym Maillard