
Inductive Biases, Invariances and Generalization in Reinforcement Learning
Anirudh Goyal · Rosemary Nan Ke · Stefan Bauer · Jane Wang · Theophane Weber · Fabio Viola · Bernhard Schölkopf

Sat Jul 18 03:00 AM -- 03:00 AM (PDT) @ None
Event URL: https://biases-invariances-generalization.github.io/

One proposed solution toward the goal of designing machines that can extrapolate experience across environments and tasks is inductive biases. Equipping learning algorithms with inductive biases might help them learn invariances, e.g. a causal graph structure, which in turn would allow an agent to generalize across environments and tasks.
While some inductive biases are already available and correspond to common knowledge, one key requirement for learning inductive biases from data seems to be the ability to perform and learn from interventions. This assumption is partly motivated by the accepted hypothesis in psychology that experimentation is necessary to discover causal relationships. This corresponds to a reinforcement learning setting in which the agent can discover causal factors by performing interventions and observing their effects.

We believe that one reason progress on building intelligent agents has been hampered is the limited availability of good inductive biases. Learning inductive biases from data is difficult because it corresponds to an interactive learning setting, which is far less well understood than classical regression or classification frameworks; for example, formal definitions of generalization in RL have not yet been developed. While reinforcement learning has already achieved impressive results, the sample complexity required to achieve consistently good performance is often prohibitively high. This has limited most RL applications to games or to settings where an accurate simulator is available. Another issue is that RL agents are often brittle in the face of even tiny changes to the environment (either visual or mechanistic changes) unseen in the training phase.

To build intuition for the scope of the generalization problem in RL, consider the task of training a robotic car mechanic that can diagnose and repair any problem with a car. Current methods are all insufficient in some respect: on-policy policy gradient algorithms need to cycle through all possible broken cars on every single iteration, off-policy algorithms become unstable when faced with complex perception and highly diverse data, and model-based methods may struggle to fully estimate a complex web of causality.

In our workshop we hope to explore research and new ideas on topics related to inductive biases, invariances and generalization, including:

- What are efficient ways to learn inductive biases from data?
- Which inductive biases are most suitable for achieving generalization?
- Can we make the problem of generalization, particularly in RL, more concrete and establish standard terminology for discussing it?
- Causality and generalization, especially in RL.
- Model-based RL and generalization.
- Sample complexity in reinforcement learning.
- Can we create models that are robust to changes in the visual environment, assuming the underlying mechanics stay the same? Should this count as generalization or as transfer learning?
- Robustness to changes in the mechanics of the environment, such as scaling of rewards.
- Can we develop a theoretical understanding of generalization in RL, and understand how it relates to the well-developed ideas of statistical learning theory?
- In RL, the training data is collected by the agent and is affected by the agent's policy; the training distribution is therefore not fixed. How does this affect how we should think about generalization?

The question of generalization in reinforcement learning is essential to the field’s future, both in theory and in practice. However, there are still open questions about the right way to think about generalization in RL, the right way to formalize the problem, and which tasks matter most. This workshop aims to address these questions by bringing together researchers from different backgrounds to discuss these challenges.

Sat 3:00 a.m. - 3:15 a.m. [iCal]
Opening remarks
Sat 3:15 a.m. - 4:30 a.m. [iCal]
Poster Session 1 (Poster Session)
Sat 4:30 a.m. - 5:00 a.m. [iCal]

Meta Gradient Reinforcement Learning

David Silver
Sat 5:00 a.m. - 5:10 a.m. [iCal]
QA for invited talk 1 Silver (10min QA)
David Silver
Sat 5:10 a.m. - 5:40 a.m. [iCal]

Multi-Domain Data Integration: From Observations to Mechanistic Insights

Caroline Uhler
Sat 5:40 a.m. - 5:50 a.m. [iCal]
QA for invited talk 2 Uhler (10min QA)
Caroline Uhler
Sat 5:50 a.m. - 6:05 a.m. [iCal]


Roberta Raileanu
Sat 6:15 a.m. - 7:30 a.m. [iCal]
Poster Session 2 (Poster Session)
Sat 7:30 a.m. - 8:10 a.m. [iCal]

Augmenting data to improve robustness – a blessing or a curse?

Fanny Yang
Sat 8:10 a.m. - 8:40 a.m. [iCal]

System 2 Priors

Yoshua Bengio
Sat 8:40 a.m. - 8:50 a.m. [iCal]
QA for invited talk 4 Bengio (10min QA)
Yoshua Bengio
Sat 8:50 a.m. - 9:05 a.m. [iCal]


Swapnil Asawa, Benjamin Eysenbach
Sat 9:05 a.m. - 9:15 a.m. [iCal]
QA for invited talk 3 Yang (10min QA)
Fanny Yang
Sat 9:15 a.m. - 9:45 a.m. [iCal]

A New RNN algorithm using the computational inductive bias of span independence

Martha White
Sat 9:45 a.m. - 9:55 a.m. [iCal]
QA for invited talk 5 White (10min QA)
Martha White
Sat 10:00 a.m. - 10:30 a.m. [iCal]

From skills to tasks: Reusing and generalizing knowledge for motor control

Nicolas Heess
Sat 10:30 a.m. - 10:40 a.m. [iCal]
QA for invited talk 6 Heess (10min QA)
Nicolas Heess
Sat 10:40 a.m. - 11:55 a.m. [iCal]
Poster Session 3 (Poster Session)
Sat 11:55 a.m. - 12:25 p.m. [iCal]

Statistical Complexity of RL and the use of regression

Mengdi Wang
Sat 12:25 p.m. - 12:35 p.m. [iCal]
QA for invited talk 7 Wang (10min QA)
Mengdi Wang
Sat 12:35 p.m. - 1:05 p.m. [iCal]

On the Theory of Policy Gradient Methods: Optimality, Generalization and Distribution Shift

Sham Kakade
Sat 1:05 p.m. - 1:15 p.m. [iCal]
QA for invited talk 8 Kakade (10min QA)
Sham Kakade
Sat 1:15 p.m. - 2:15 p.m. [iCal]
Panel Discussion Session (Discussion Panel)


Danijar Hafner


Jerry Zikun Chen


aajay3110 Ajay


Tianhe Yu


Kanika Madan


Nasim Rahaman


Abhinav Gupta


Robert Müller


Min Sun, Wei-Cheng Tseng


Tao Chen


Gerrit Schoettler


Ashwin Kalyan, Nirbhay Modhe


Chuan Wen


Arip Asadulaev


Karl Pertsch


Ramanan Sekar


Eugene Vinitsky


Purva Pruthi


Jialin Song


Gaurav Sukhatme


Henrik Marklund


Giancarlo Kerg


Michael Lutter


Ezgi Korkmaz


Aviral Kumar


Saurabh Kumar, Aviral Kumar


Ravi Chunduru


Theresa Eimer


Arnab Kumar Mondal


Pierre-Alexandre Kamienny


Taylor Killian


Amy Zhang


Russell Mendonca


André Biedenkapp


Kartik Ahuja


Yash Nair


Daksh Idnani


Harshit Sikchi


Tengyu Ma, Zichuan Lin


Benjamin Eysenbach


Amy Zhang


Marin Vlastelica Pogancic


Chi Zhang


Silviu Pitis


Ilya Kostrikov

Author Information

Anirudh Goyal (Université de Montréal)
Rosemary Nan Ke (MILA, University of Montreal)

I am a PhD student at Mila, advised by Chris Pal and Yoshua Bengio. My research interests are efficient credit assignment, causal learning, and model-based reinforcement learning. Here is my homepage: https://nke001.github.io/

Stefan Bauer (Max Planck Institute for Intelligent Systems)
Jane Wang (DeepMind)
Theo Weber (DeepMind)
Fabio Viola (DeepMind)
Bernhard Schölkopf (MPI for Intelligent Systems Tübingen, Germany)

Bernhard Schölkopf received degrees in mathematics (London) and physics (Tübingen), and a doctorate in computer science from the Technical University Berlin. He has researched at AT&T Bell Labs, at GMD FIRST, Berlin, at the Australian National University, Canberra, and at Microsoft Research Cambridge (UK). In 2001, he was appointed scientific member of the Max Planck Society and director at the MPI for Biological Cybernetics; in 2010 he founded the Max Planck Institute for Intelligent Systems. For further information, see www.kyb.tuebingen.mpg.de/~bs.

