Workshop
Inductive Biases, Invariances and Generalization in Reinforcement Learning
Anirudh Goyal · Rosemary Nan Ke · Stefan Bauer · Jane Wang · Theophane Weber · Fabio Viola · Bernhard Schölkopf · Stefan Bauer

Sat Jul 18 03:00 AM -- 03:00 AM (PDT)
Event URL: https://biases-invariances-generalization.github.io/

One proposed route toward the goal of designing machines that can extrapolate experience across environments and tasks is inductive biases. Equipping algorithms with inductive biases may help them learn invariances, e.g. a causal graph structure, which in turn allows an agent to generalize across environments and tasks.
While some inductive biases are already available and correspond to common knowledge, one key requirement for learning inductive biases from data seems to be the ability to perform and learn from interventions. This assumption is partly motivated by the accepted hypothesis in psychology that experimentation is necessary to discover causal relationships. This corresponds to a reinforcement learning environment in which the agent can discover causal factors through interventions and by observing their effects.

We believe that one reason progress on building intelligent agents has been hampered is the limited availability of good inductive biases. Learning inductive biases from data is difficult because it corresponds to an interactive learning setting, which is far less well understood than classical regression or classification frameworks; e.g., even formal definitions of generalization in RL have not been developed. While reinforcement learning has already achieved impressive results, the sample complexity required to achieve consistently good performance is often prohibitively high. This has limited most RL applications to games or settings where an accurate simulator is available. Another issue is that RL agents are often brittle in the face of even tiny changes to the environment (visual or mechanistic) unseen in the training phase.

To build intuition for the scope of the generalization problem in RL, consider the task of training a robotic car mechanic that can diagnose and repair any problem with a car. Current methods are all insufficient in some respect -- on-policy policy gradient algorithms would need to cycle through all possible broken cars on every single iteration, off-policy algorithms become unstable when faced with perception and highly diverse data, and model-based methods may struggle to fully estimate a complex web of causality.

In our workshop we hope to explore research and new ideas on topics related to inductive biases, invariances and generalization, including:

- What are efficient ways to learn inductive biases from data?
- Which inductive biases are most suitable for achieving generalization?
- Can we make the problem of generalization, in particular for RL, more concrete and establish standard terminology for discussing it?
- Causality and generalization, especially in RL.
- Model-based RL and generalization.
- Sample complexity in reinforcement learning.
- Can we create models that are robust to changes in the visual environment, assuming the underlying mechanics stay the same? Should this count as generalization or transfer learning?
- Robustness to changes in the mechanics of the environment, such as scaling of rewards.
- Can we develop a theoretical understanding of generalization in RL, and understand how it relates to the well-developed ideas of statistical learning theory?
- In RL, the training data is collected by the agent and is affected by the agent's policy; the training distribution is therefore not a fixed distribution. How does this affect how we should think about generalization?

The question of generalization in reinforcement learning is essential to the field's future, both in theory and in practice. However, there are still open questions about the right way to think about generalization in RL, the right way to formalize the problem, and which tasks are most important. This workshop aims to address these issues by bringing together researchers from different backgrounds to discuss these challenges.

Sat 3:00 a.m. - 3:15 a.m. [iCal]
Opening remarks
Sat 3:15 a.m. - 4:30 a.m. [iCal]
Poster Session 1 (Poster Session)
Sat 4:30 a.m. - 5:00 a.m. [iCal]

Meta Gradient Reinforcement Learning

David Silver
Sat 5:00 a.m. - 5:10 a.m. [iCal]
QA for invited talk 1 Silver (10min QA)
David Silver
Sat 5:10 a.m. - 5:40 a.m. [iCal]

Multi-Domain Data Integration: From Observations to Mechanistic Insights

Caroline Uhler
Sat 5:40 a.m. - 5:50 a.m. [iCal]
QA for invited talk 2 Uhler (10min QA)
Caroline Uhler
Sat 5:50 a.m. - 6:05 a.m. [iCal]

http://slideslive.com/38931360

Roberta Raileanu
Sat 6:15 a.m. - 7:30 a.m. [iCal]
Poster Session 2 (Poster Session)
Sat 7:30 a.m. - 8:10 a.m. [iCal]

Augmenting data to improve robustness – a blessing or a curse?

Fanny Yang
Sat 8:10 a.m. - 8:40 a.m. [iCal]

System 2 Priors

Yoshua Bengio
Sat 8:40 a.m. - 8:50 a.m. [iCal]
QA for invited talk 4 Bengio (10min QA)
Yoshua Bengio
Sat 8:50 a.m. - 9:05 a.m. [iCal]

http://slideslive.com/38931338

Swapnil Asawa, Benjamin Eysenbach
Sat 9:05 a.m. - 9:15 a.m. [iCal]
QA for invited talk 3 Yang (10min QA)
Fanny Yang
Sat 9:15 a.m. - 9:45 a.m. [iCal]

A New RNN algorithm using the computational inductive bias of span independence

Martha White
Sat 9:45 a.m. - 9:55 a.m. [iCal]
QA for invited talk 5 White (10min QA)
Martha White
Sat 10:00 a.m. - 10:30 a.m. [iCal]

From skills to tasks: Reusing and generalizing knowledge for motor control

Nicolas Heess
Sat 10:30 a.m. - 10:40 a.m. [iCal]
QA for invited talk 6 Heess (10min QA)
Nicolas Heess
Sat 10:40 a.m. - 11:55 a.m. [iCal]
Poster Session 3 (Poster Session)
Sat 11:55 a.m. - 12:25 p.m. [iCal]

Statistical Complexity of RL and the use of regression

Mengdi Wang
Sat 12:25 p.m. - 12:35 p.m. [iCal]
QA for invited talk 7 Wang (10min QA)
Mengdi Wang
Sat 12:35 p.m. - 1:05 p.m. [iCal]

On the Theory of Policy Gradient Methods: Optimality, Generalization and Distribution Shift

Sham Kakade
Sat 1:05 p.m. - 1:15 p.m. [iCal]
QA for invited talk 8 Kakade (10min QA)
Sham Kakade
Sat 1:15 p.m. - 2:15 p.m. [iCal]
Panel Discussion Session (Discussion Panel)
-

http://slideslive.com/38931301

Danijar Hafner
-

http://slideslive.com/38931304

Jerry Zikun Chen
-

http://slideslive.com/38931326

aajay3110 Ajay
-

http://slideslive.com/38931302

Tianhe Yu
-

http://slideslive.com/38931327

Kanika Madan
-

http://slideslive.com/38931303

Nasim Rahaman
-

http://slideslive.com/38931329

Abhinav Gupta
-

http://slideslive.com/38931330

Robert Müller
-

http://slideslive.com/38931331

Min Sun, Wei-Cheng Tseng
-

http://slideslive.com/38931328

Tao Chen
-

http://slideslive.com/38931362

Gerrit Schoettler
-

http://slideslive.com/38931333

Ashwin Kalyan, Nirbhay Modhe
-

http://slideslive.com/38931530

Chuan Wen
-

http://slideslive.com/38931436

Arip Asadulaev
-

http://slideslive.com/38931332

Karl Pertsch
-

http://slideslive.com/38931334

Ramanan Sekar
-

http://slideslive.com/38931336

Eugene Vinitsky
-

http://slideslive.com/38931337

Purva Pruthi
-

http://slideslive.com/38931339

Jialin Song
-

http://slideslive.com/38931340

Gaurav Sukhatme
-

http://slideslive.com/38931341

Henrik Marklund
-

http://slideslive.com/38931335

Giancarlo Kerg
-

http://slideslive.com/38931342

Michael Lutter
-

http://slideslive.com/38931437

Ezgi Korkmaz
-

http://slideslive.com/38931343

Aviral Kumar
-

http://slideslive.com/38931345

Saurabh Kumar, Aviral Kumar
-

http://slideslive.com/38931346

Ravi Chunduru
-

http://slideslive.com/38931344

Theresa Eimer
-

http://slideslive.com/38931347

Arnab Kumar Mondal
-

http://slideslive.com/38931348

Pierre-Alexan Kamienny
-

http://slideslive.com/38931438

Taylor Killian
-

http://slideslive.com/38931349

Amy Zhang
-

http://slideslive.com/38931350

Russell Mendonca
-

http://slideslive.com/38931351

André Biedenkapp
-

http://slideslive.com/38931353

Kartik Ahuja
-

http://slideslive.com/38931354

Yash Nair
-

http://slideslive.com/38931352

Daksh Idnani
-

http://slideslive.com/38931439

Harshit Sikchi
-

http://slideslive.com/38931355

Tengyu Ma, Zichuan Lin
-

http://slideslive.com/38931440

Benjamin Eysenbach
-

http://slideslive.com/38931356

Amy Zhang
-

http://slideslive.com/38931357

Marin Vlastelica Pogancic
-

http://slideslive.com/38931358

Chi Zhang
-

http://slideslive.com/38931359

Silviu Pitis
-

http://slideslive.com/38931361

Ilya Kostrikov

Author Information

Anirudh Goyal (Université de Montréal)
Rosemary Nan Ke (MILA, University of Montreal)

I am a PhD student at Mila, advised by Chris Pal and Yoshua Bengio. My research interests are efficient credit assignment, causal learning and model-based reinforcement learning. Here is my homepage: https://nke001.github.io/

Stefan Bauer (Max Planck Institute for Intelligent Systems)
Jane Wang (DeepMind)
Theo Weber (DeepMind)
Fabio Viola (DeepMind)
Bernhard Schölkopf (MPI for Intelligent Systems Tübingen, Germany)

Bernhard Schölkopf received degrees in mathematics (London) and physics (Tübingen), and a doctorate in computer science from the Technical University Berlin. He has researched at AT&T Bell Labs, at GMD FIRST, Berlin, at the Australian National University, Canberra, and at Microsoft Research Cambridge (UK). In 2001, he was appointed scientific member of the Max Planck Society and director at the MPI for Biological Cybernetics; in 2010 he founded the Max Planck Institute for Intelligent Systems. For further information, see www.kyb.tuebingen.mpg.de/~bs.

Stefan Bauer (MPI for Intelligent Systems)
