Keywords: reinforcement learning, inductive bias, generalization
One proposed route toward machines that can extrapolate experience across environments and tasks is the use of inductive biases. Equipping algorithms with inductive biases may help them learn invariances, e.g. a causal graph structure, which in turn allows an agent to generalize across environments and tasks.
While some inductive biases are already available and correspond to common knowledge, a key requirement for learning inductive biases from data appears to be the ability to perform interventions and learn from their outcomes. This assumption is partly motivated by the widely accepted hypothesis in psychology that experimentation is necessary to discover causal relationships. It corresponds to a reinforcement learning setting in which the agent can discover causal factors by intervening and observing the effects.
We believe one reason progress on building intelligent agents has been hampered is the limited availability of good inductive biases. Learning inductive biases from data is difficult because it corresponds to an interactive learning setting, which is far less well understood than classical regression or classification; for example, formal definitions of generalization in RL are still lacking. While reinforcement learning has achieved impressive results, the sample complexity required for consistently good performance is often prohibitively high. This has confined most RL work to games or settings where an accurate simulator is available. Another issue is that RL agents are often brittle in the face of even tiny changes to the environment, whether visual or mechanistic, that were unseen during training.
To build intuition for the scope of the generalization problem in RL, consider the task of training a robotic car mechanic that can diagnose and repair any problem with a car. Current methods are all insufficient in some respect: on-policy policy gradient algorithms need to cycle through all possible broken cars on every single iteration, off-policy algorithms become unstable when faced with rich perception and highly diverse data, and model-based methods may struggle to estimate the full web of causal mechanisms.
In our workshop we hope to explore research and new ideas on topics related to inductive biases, invariances and generalization, including:
- What are efficient ways to learn inductive biases from data?
- Which inductive biases are most suitable to achieve generalization?
- Can we make the problem of generalization, particularly in RL, more concrete, and establish standard terminology for discussing it?
- Causality and generalization, especially in RL.
- Model-based RL and generalization.
- Sample complexity in reinforcement learning.
- Can we create models that are robust to changes in the visual environment, assuming the underlying mechanics stay the same? Should this count as generalization or as transfer learning?
- Robustness to changes in the mechanics of the environment, such as scaling of rewards.
- Can we create a theoretical understanding of generalization in RL, and understand how it relates to the well-developed ideas of statistical learning theory?
- In RL, the training data is collected by the agent and is affected by the agent's policy. The training distribution is therefore not a fixed distribution. How does this affect how we should think about generalization?
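The last point can be made concrete with a toy simulation. The sketch below (purely illustrative; the chain MDP and the `rollout` helper are hypothetical constructions, not from any workshop submission) shows that in RL the "training distribution" over states is induced by the policy itself: two different policies in the same environment produce two different data distributions.

```python
import random

# A toy 3-state chain MDP: the agent starts in state 0 and chooses
# RIGHT (drift toward state 2) or LEFT (drift toward state 0).
# The stream of visited states is the agent's "training data",
# so its distribution depends on the policy being followed.

def rollout(policy_right_prob, steps=10_000, seed=0):
    """Return the empirical state-visitation frequencies for a policy
    that picks RIGHT with probability policy_right_prob."""
    rng = random.Random(seed)
    state, counts = 0, [0, 0, 0]
    for _ in range(steps):
        counts[state] += 1
        if rng.random() < policy_right_prob:   # action: RIGHT
            state = min(state + 1, 2)
        else:                                  # action: LEFT
            state = max(state - 1, 0)
    return [c / steps for c in counts]

# Two policies induce two different training distributions over states:
print(rollout(policy_right_prob=0.1))  # mass concentrated near state 0
print(rollout(policy_right_prob=0.9))  # mass concentrated near state 2
```

Any notion of generalization in RL has to account for this coupling: changing the policy changes the data it will be evaluated and trained on.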
The question of generalization in reinforcement learning is essential to the field's future, both in theory and in practice. However, there are still open questions about the right way to think about generalization in RL, the right way to formalize the problem, and which tasks matter most. This workshop aims to address these issues by bringing together researchers from different backgrounds to discuss these challenges.
Schedule:

Sat 3:00 a.m. - 3:15 a.m. | Opening remarks
Sat 3:15 a.m. - 4:30 a.m. | Poster Session 1
Sat 4:30 a.m. - 5:00 a.m. | Invited talk 1 (David Silver): Meta Gradient Reinforcement Learning
Sat 5:00 a.m. - 5:10 a.m. | Q&A for invited talk 1 (David Silver)
Sat 5:10 a.m. - 5:40 a.m. | Invited talk 2 (Caroline Uhler): Multi-Domain Data Integration: From Observations to Mechanistic Insights
Sat 5:40 a.m. - 5:50 a.m. | Q&A for invited talk 2 (Caroline Uhler)
Sat 5:50 a.m. - 6:05 a.m. | Spotlight: Automatic Data Augmentation for Generalization in Reinforcement Learning | Roberta Raileanu | http://slideslive.com/38931360
Sat 6:15 a.m. - 7:30 a.m. | Poster Session 2
Sat 7:30 a.m. - 8:10 a.m. | Invited talk 3 (Fanny Yang): Augmenting data to improve robustness – a blessing or a curse?
Sat 8:10 a.m. - 8:40 a.m. | Invited talk 4 (Yoshua Bengio): System 2 Priors
Sat 8:40 a.m. - 8:50 a.m. | Q&A for invited talk 4 (Yoshua Bengio)
Sat 8:50 a.m. - 9:05 a.m. | Spotlight: Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers | Swapnil Asawa, Benjamin Eysenbach | http://slideslive.com/38931338
Sat 9:05 a.m. - 9:15 a.m. | Q&A for invited talk 3 (Fanny Yang)
Sat 9:15 a.m. - 9:45 a.m. | Invited talk 5 (Martha White): A New RNN algorithm using the computational inductive bias of span independence
Sat 9:45 a.m. - 9:55 a.m. | Q&A for invited talk 5 (Martha White)
Sat 10:00 a.m. - 10:30 a.m. | Invited talk 6 (Nicolas Heess): From skills to tasks: Reusing and generalizing knowledge for motor control
Sat 10:30 a.m. - 10:40 a.m. | Q&A for invited talk 6 (Nicolas Heess)
Sat 10:40 a.m. - 11:55 a.m. | Poster Session 3
Sat 11:55 a.m. - 12:25 p.m. | Invited talk 7 (Mengdi Wang): Statistical Complexity of RL and the use of regression
Sat 12:25 p.m. - 12:35 p.m. | Q&A for invited talk 7 (Mengdi Wang)
Sat 12:35 p.m. - 1:05 p.m. | Invited talk 8 (Sham Kakade): On the Theory of Policy Gradient Methods: Optimality, Generalization and Distribution Shift
Sat 1:05 p.m. - 1:15 p.m. | Q&A for invited talk 8 (Sham Kakade)
Sat 1:15 p.m. - 2:15 p.m. | Panel Discussion
Posters:

- Evaluating Agents without Rewards | Danijar Hafner | http://slideslive.com/38931301
- Reinforcement Learning Generalization with Surprise Minimization | Jerry Zikun Chen | http://slideslive.com/38931304
- Learning Action Priors for Visuomotor transfer | Anurag Ajay | http://slideslive.com/38931326
- MOPO: Model-based Offline Policy Optimization | Tianhe (Kevin) Yu | http://slideslive.com/38931302
- Meta Attention Networks: Meta Learning Attention To Modulate Information Between Sparsely Interacting Recurrent Modules | Kanika Madan | http://slideslive.com/38931327
- Spatially Structured Recurrent Modules | Nasim Rahaman | http://slideslive.com/38931303
- Neural Dynamic Policies for End-to-End Sensorimotor Learning | Abhinav Gupta | http://slideslive.com/38931329
- Watch your Weight Reinforcement Learning | Robert Müller | http://slideslive.com/38931330
- HAT: Hierarchical Alternative Training for Long Range Policy Transfer | Min Sun, Wei-Cheng Tseng | http://slideslive.com/38931331
- Learning to Learn from Failures Using Replay | Tao Chen | http://slideslive.com/38931328
- Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks | Gerrit Schoettler | http://slideslive.com/38931362
- Bridging Worlds in Reinforcement Learning with Model-Advantage | Ashwin Kalyan, Nirbhay Modhe | http://slideslive.com/38931333
- Fighting Copycat Agents in Behavioral Cloning From Multiple Observations | Chuan Wen | http://slideslive.com/38931530
- Conditioning of Reinforcement Learning Agents and its Policy Regularization Application (5 min poster) | Arip Asadulaev | http://slideslive.com/38931436
- Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors | Karl Pertsch | http://slideslive.com/38931332
- Planning to Explore via Self-Supervised World Models | Ramanan Sekar | http://slideslive.com/38931334
- Robust Reinforcement Learning using Adversarial Populations | Eugene Vinitsky | http://slideslive.com/38931336
- Structure Mapping for Transferability of Causal Models | Purva Pruthi | http://slideslive.com/38931337
- Efficient Imitation Learning with Local Trajectory Optimization | Jialin Song | http://slideslive.com/38931339
- Efficient Adaptation for End-to-End Vision-Based Robotic Manipulation | Gaurav Sukhatme | http://slideslive.com/38931340
- Exact (Then Approximate) Dynamic Programming for Deep Reinforcement Learning | Henrik Marklund | http://slideslive.com/38931341
- Learning Long-term Dependencies Using Cognitive Inductive Biases in Self-attention RNNs | Giancarlo Kerg | http://slideslive.com/38931335
- A Differentiable Newton Euler Algorithm for Multi-body Model Learning | Michael Lutter | http://slideslive.com/38931342
- Nesterov Momentum Adversarial Perturbations in the Deep Reinforcement Learning Domain | Ezgi Korkmaz | http://slideslive.com/38931437
- DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Aviral Kumar | http://slideslive.com/38931343
- One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL | Saurabh Kumar, Aviral Kumar | http://slideslive.com/38931345
- Attention Option-Critic | Raviteja Chunduru | http://slideslive.com/38931346
- Towards Self-Paced Context Evaluation for Contextual Reinforcement Learning | Theresa Eimer | http://slideslive.com/38931344
- Group Equivariant Deep Reinforcement Learning | Arnab Kumar Mondal | http://slideslive.com/38931347
- Probing Dynamic Environments with Informed Policy Regularization | Pierre-Alexandre Kamienny | http://slideslive.com/38931348
- Counterfactual Transfer via Inductive Bias in Clinical Settings | Taylor Killian | http://slideslive.com/38931438
- Learning Invariant Representations for Reinforcement Learning without Reconstruction | Amy Zhang | http://slideslive.com/38931349
- Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling | Russell Mendonca | http://slideslive.com/38931350
- Towards TempoRL: Learning When to Act | André Biedenkapp | http://slideslive.com/38931351
- On the Equivalence of Bi-Level Optimization and Game-Theoretic Formulations of Invariant Risk Minimization | Kartik Ahuja | http://slideslive.com/38931353
- PAC Imitation and Model-based Batch Learning of Contextual MDPs | Yash Nair | http://slideslive.com/38931354
- Learning Robust Representations with Score Invariant Learning | Daksh Idnani | http://slideslive.com/38931352
- Learning Off-Policy with Online Planning | Harshit Sikchi | http://slideslive.com/38931439
- Model-based Adversarial Meta-Reinforcement Learning | Tengyu Ma, Zichuan Lin | http://slideslive.com/38931355
- If MaxEnt RL is the Answer, What is the Question? | Benjamin Eysenbach | http://slideslive.com/38931440
- Multi-Task Reinforcement Learning as a Hidden-Parameter Block MDP | Amy Zhang | http://slideslive.com/38931356
- Discrete Planning with End-to-end Trained Neuro-algorithmic Policies | Marin Vlastelica | http://slideslive.com/38931357
- Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization without Compounding Errors | Chi Zhang | http://slideslive.com/38931358
- Counterfactual Data Augmentation using Locally Factored Dynamics | Silviu Pitis | http://slideslive.com/38931359
- Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels | Ilya Kostrikov | http://slideslive.com/38931361