Inductive Biases, Invariances and Generalization in Reinforcement Learning

Workshop

Anirudh Goyal · Rosemary Nan Ke · Jane Wang · Stefan Bauer · Theophane Weber · Fabio Viola · Bernhard Schölkopf

Sat 18 Jul, 3 a.m. PDT

Keywords: Reinforcement Learning · inductive bias · generalization

Inductive biases are one proposed route toward machines that can extrapolate experience across environments and tasks. Equipping algorithms with inductive biases from the start might help them learn invariances, e.g. a causal graph structure, which in turn would allow the agent to generalize across environments and tasks.
While some inductive biases are already available and correspond to common knowledge, one key requirement for learning inductive biases from data seems to be the ability to perform and learn from interventions. This assumption is partially motivated by the accepted hypothesis in psychology that experimentation is needed to discover causal relationships. This corresponds to a reinforcement learning environment, where the agent can discover causal factors by performing interventions and observing their effects.

We believe that one factor hampering progress on building intelligent agents is the limited availability of good inductive biases. Learning inductive biases from data is difficult because it corresponds to an interactive learning setting, which is far less understood than classical regression or classification frameworks; for example, even formal definitions of generalization in RL have not been developed. While reinforcement learning has already achieved impressive results, the sample complexity required to achieve consistently good performance is often prohibitively high. This has limited most RL to either games or settings where an accurate simulator is available. Another issue is that RL agents are often brittle in the face of even tiny changes to the environment (either visual or mechanistic) that were unseen in the training phase.

To build intuition for the scope of the generalization problem in RL, consider the task of training a robotic car mechanic that can diagnose and repair any problem with a car. Current methods are all insufficient in some respect: on-policy policy gradient algorithms need to cycle through all possible broken cars on every single iteration, off-policy algorithms end up with a mess of instability due to perception and highly diverse data, and model-based methods may struggle to fully estimate a complex web of causality.

In our workshop we hope to explore research and new ideas on topics related to inductive biases, invariances and generalization, including:

- What are efficient ways to learn inductive biases from data?
- Which inductive biases are most suitable to achieve generalization?
- Can we make the problem of generalization, in particular for RL, more concrete and agree on standard terms for discussing it?
- Causality and generalization, especially in RL.
- Model-based RL and generalization.
- Sample complexity in reinforcement learning.
- Can we create models that are robust to changes in the visual environment, assuming all the underlying mechanics are the same? Should this count as generalization or transfer learning?
- Robustness to changes in the mechanics of the environment, such as scaling of rewards.
- Can we create a theoretical understanding of generalization in RL, and understand how it relates to the well-developed ideas from statistical learning theory?
- In RL, the training data is collected by the agent and is affected by the agent's policy. Therefore, the training distribution is not fixed. How does this affect how we should think about generalization?

The question of generalization in reinforcement learning is essential to the field’s future, both in theory and in practice. However, there are still open questions about the right way to think about generalization in RL, the right way to formalize the problem, and the most important tasks. This workshop aims to help address these questions by bringing together researchers from different backgrounds to discuss these challenges.

Timezone: America/Los_Angeles

Schedule

Sat 3:00 a.m. - 3:15 a.m.	Opening remarks (Opening remarks)
Sat 3:15 a.m. - 4:30 a.m.	Poster Session 1 (Poster Session)
Sat 4:30 a.m. - 5:00 a.m.	Invited talk 1 Silver (Talk)	David Silver
Sat 5:00 a.m. - 5:10 a.m.	QA for invited talk 1 Silver (10min QA)	David Silver
Sat 5:10 a.m. - 5:40 a.m.	Invited talk 2 Uhler (Talk)	Caroline Uhler
Sat 5:40 a.m. - 5:50 a.m.	QA for invited talk 2 Uhler (10min QA)	Caroline Uhler
Sat 5:50 a.m. - 6:05 a.m.	Automatic Data Augmentation for Generalization in Reinforcement Learning (Spotlight)	Roberta Raileanu
Sat 6:15 a.m. - 7:30 a.m.	Poster Session 2 (Poster Session)
Sat 7:30 a.m. - 8:10 a.m.	Invited talk 3 Yang (Talk)	Fanny Yang
Sat 8:10 a.m. - 8:40 a.m.	Invited talk 4 Bengio (Talk)	Yoshua Bengio
Sat 8:40 a.m. - 8:50 a.m.	QA for invited talk 4 Bengio (10min QA)	Yoshua Bengio
Sat 8:50 a.m. - 9:05 a.m.	Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers (Spotlight)	Swapnil Asawa · Benjamin Eysenbach
Sat 9:05 a.m. - 9:15 a.m.	QA for invited talk 3 Yang (10min QA)	Fanny Yang
Sat 9:15 a.m. - 9:45 a.m.	Invited talk 5 White (Talk)	Martha White
Sat 9:45 a.m. - 9:55 a.m.	QA for invited talk 5 White (10min QA)	Martha White
Sat 10:00 a.m. - 10:30 a.m.	Invited talk 6 Heess (Talk)	Nicolas Heess
Sat 10:30 a.m. - 10:40 a.m.	QA for invited talk 6 Heess (10min QA)	Nicolas Heess
Sat 10:40 a.m. - 11:55 a.m.	Poster Session 3 (Poster Session)
Sat 11:55 a.m. - 12:25 p.m.	Invited talk 7 Wang (Talk)	Mengdi Wang
Sat 12:25 p.m. - 12:35 p.m.	QA for invited talk 7 Wang (10min QA)	Mengdi Wang
Sat 12:35 p.m. - 1:05 p.m.	Invited talk 8 Kakade (Talk)	Sham Kakade
Sat 1:05 p.m. - 1:15 p.m.	QA for invited talk 8 Kakade (10min QA)	Sham Kakade
Sat 1:15 p.m. - 2:15 p.m.	Panel Discussion Session (Discussion Panel)
-	Evaluating Agents without Rewards (Poster)	Danijar Hafner
-	Reinforcement Learning Generalization with Surprise Minimization (Poster)	Jerry Zikun Chen
-	Learning Action Priors for Visuomotor Transfer (Poster)	Anurag Ajay
-	MOPO: Model-based Offline Policy Optimization (Poster)	Tianhe (Kevin) Yu
-	Meta Attention Networks: Meta Learning Attention To Modulate Information Between Sparsely Interacting Recurrent Modules (Poster)	Kanika Madan
-	Spatially Structured Recurrent Modules (Poster)	Nasim Rahaman
-	Neural Dynamic Policies for End-to-End Sensorimotor Learning (Poster)	Abhinav Gupta
-	Watch your Weight Reinforcement Learning (Poster)	Robert Müller
-	HAT: Hierarchical Alternative Training for Long Range Policy Transfer (Poster)	Min Sun · Wei-Cheng Tseng
-	Learning to Learn from Failures Using Replay (Poster)	Tao Chen
-	Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks (Poster)	Gerrit Schoettler
-	Bridging Worlds in Reinforcement Learning with Model-Advantage (Poster)	Ashwin Kalyan · Nirbhay Modhe
-	Fighting Copycat Agents in Behavioral Cloning From Multiple Observations (Poster)	Chuan Wen
-	Conditioning of Reinforcement Learning Agents and its Policy Regularization Application (Poster, 5 min)	Arip Asadulaev
-	Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors (Poster)	Karl Pertsch
-	Planning to Explore via Self-Supervised World Models (Poster)	Ramanan Sekar
-	Robust Reinforcement Learning using Adversarial Populations (Poster)	Eugene Vinitsky
-	Structure Mapping for Transferability of Causal Models (Poster)	Purva Pruthi
-	Efficient Imitation Learning with Local Trajectory Optimization (Poster)	Jialin Song
-	Efficient Adaptation for End-to-End Vision-Based Robotic Manipulation (Poster)	Gaurav Sukhatme
-	Exact (Then Approximate) Dynamic Programming for Deep Reinforcement Learning (Poster)	Henrik Marklund
-	Learning Long-term Dependencies Using Cognitive Inductive Biases in Self-attention RNNs (Poster)	Giancarlo Kerg
-	A Differentiable Newton Euler Algorithm for Multi-body Model Learning (Poster)	Michael Lutter
-	Nesterov Momentum Adversarial Perturbations in the Deep Reinforcement Learning Domain (Poster)	Ezgi Korkmaz
-	DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction (Poster)	Aviral Kumar
-	One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL (Poster)	Saurabh Kumar · Aviral Kumar
-	Attention Option-Critic (Poster)	Raviteja Chunduru
-	Towards Self-Paced Context Evaluation for Contextual Reinforcement Learning (Poster)	Theresa Eimer
-	Group Equivariant Deep Reinforcement Learning (Poster)	Arnab Kumar Mondal
-	Probing Dynamic Environments with Informed Policy Regularization (Poster)	Pierre-Alexandre Kamienny
-	Counterfactual Transfer via Inductive Bias in Clinical Settings (Poster)	Taylor W. Killian
-	Learning Invariant Representations for Reinforcement Learning without Reconstruction (Poster)	Amy Zhang
-	Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling (Poster)	Russell Mendonca
-	Towards TempoRL: Learning When to Act (Poster)	André Biedenkapp
-	On the Equivalence of Bi-Level Optimization and Game-Theoretic Formulations of Invariant Risk Minimization (Poster)	Kartik Ahuja
-	PAC Imitation and Model-based Batch Learning of Contextual MDPs (Poster)	Yash Nair
-	Learning Robust Representations with Score Invariant Learning (Poster)	Daksh Idnani
-	Learning Off-Policy with Online Planning (Poster)	Harshit Sikchi
-	Model-based Adversarial Meta-Reinforcement Learning (Poster)	Tengyu Ma · Zichuan Lin
-	If MaxEnt RL is the Answer, What is the Question? (Poster)	Benjamin Eysenbach
-	Multi-Task Reinforcement Learning as a Hidden-Parameter Block MDP (Poster)	Amy Zhang
-	Discrete Planning with End-to-end Trained Neuro-algorithmic Policies (Poster)	Marin Vlastelica
-	Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization without Compounding Errors (Poster)	Chi Zhang
-	Counterfactual Data Augmentation using Locally Factored Dynamics (Poster)	Silviu Pitis
-	Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels (Poster)	Ilya Kostrikov