
4th Lifelong Learning Workshop
Shagun Sodhani · Sarath Chandar · Balaraman Ravindran · Doina Precup

Sat Jul 18 02:00 AM -- 06:00 PM (PDT)
Event URL: https://lifelongml.github.io/

One of the most significant and challenging open problems in Artificial Intelligence (AI) is the problem of Lifelong Learning. Lifelong Machine Learning considers systems that can continually learn many tasks (from one or more domains) over a lifetime. A lifelong learning system efficiently and effectively:

1. retains the knowledge it has learned from different tasks;

2. selectively transfers knowledge (from previously learned tasks) to facilitate the learning of new tasks;

3. ensures the effective and efficient interaction between (1) and (2).

Lifelong Learning introduces several fundamental challenges in training models that generally do not arise in a single-task batch learning setting. These include problems such as catastrophic forgetting and capacity saturation. This workshop aims to explore solutions to these problems in both supervised learning and reinforcement learning settings.
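
Catastrophic forgetting can be seen even in a toy setting. The following is a minimal sketch, assuming a hypothetical one-parameter linear model trained with plain SGD on two conflicting tasks in sequence. The tasks, learning rate, and helper names are illustrative assumptions and are not taken from the workshop material.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]


def sgd_fit(w: float, data: Dataset, lr: float = 0.1, epochs: int = 50) -> float:
    """Fit y = w * x by per-sample gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x
            w -= lr * grad
    return w


def mse(w: float, data: Dataset) -> float:
    """Mean squared error of the model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


xs = [-1.0, -0.5, 0.5, 1.0]
task_a: Dataset = [(x, 2.0 * x) for x in xs]   # hypothetical task A: slope +2
task_b: Dataset = [(x, -1.0 * x) for x in xs]  # hypothetical task B: slope -1

# Learn task A, then continue training on task B only (no replay of A).
w = sgd_fit(0.0, task_a)
loss_a_before = mse(w, task_a)
w = sgd_fit(w, task_b)
loss_a_after = mse(w, task_a)

print(f"loss on A after learning A: {loss_a_before:.6f}")
print(f"loss on A after learning B: {loss_a_after:.6f}")
```

Because the single weight is shared, fitting task B overwrites what was learned for task A, so the loss on A rises sharply. Real lifelong learning methods aim to avoid this interference without simply storing every past task.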

Sat 2:00 a.m. - 2:15 a.m.

Opening Remarks (Introduction and Overview)

Sarath Chandar, Shagun Sodhani
Sat 2:15 a.m. - 2:45 a.m.

Lifelong reinforcement learning holds promise to enable autonomous decision making in applications ranging from household robotics to autonomous space exploration. In this talk we will discuss key challenges that need to be addressed to make progress towards this grand vision. First, we discuss the fundamental need for the community to invest in a shared understanding of problem formalizations, evaluation paradigms, and benchmarks. Second, we dive deeper into ongoing work towards tackling exploration and representation in lifelong reinforcement learning. Our aim is to spark debate and inspire research in this exciting space.

Katja Hofmann, Rika Antonova, Luisa Zintgraf
Sat 2:45 a.m. - 3:00 a.m.
Q&A with Katja Hofmann (Q&A)
Katja Hofmann, Luisa Zintgraf, Rika Antonova, Sarath Chandar, Shagun Sodhani
Sat 3:00 a.m. - 3:30 a.m.

Humans learn many different functions and skills from diverse experiences gained over many years and from a staged curriculum in which they first learn easier and later more difficult tasks. They retain the learned knowledge and skills, which are used in subsequent learning to make it easier or more effective. Furthermore, humans self-reflect on their evolving skills, choose new learning tasks over time, teach one another, learn new representations, read books, discuss competing hypotheses, and more. In this talk, I shall share my thoughts on the question of how to design machine learning agents with similar capabilities.

Partha Talukdar
Sat 3:30 a.m. - 3:45 a.m.
Q&A with Partha Pratim Talukdar (Q&A)
Partha Talukdar, Shagun Sodhani, Sarath Chandar
Sat 3:45 a.m. - 4:00 a.m.
Contributed Talk: Continual Deep Learning by Functional Regularisation of Memorable Past (Talk)   
Siddharth Swaroop
Sat 4:00 a.m. - 6:00 a.m.
Sat 6:00 a.m. - 6:30 a.m.

Most current artificial reinforcement learning (RL) agents are trained under the assumption of repeatable trials, and are reset at the beginning of each trial. Humans, however, are never reset. Instead, they are allowed to discover computable patterns across trials, e.g.: in every third trial, go left to obtain reward, otherwise go right. General RL (sometimes called AGI) must assume a single lifelong trial which may or may not include identifiable sub-trials. General RL must also explicitly take into account that policy changes in early life may affect properties of later sub-trials and policy changes. In particular, General RL must take into account recursively that early meta-meta-learning is setting the stage for later meta-learning which is setting the stage for later learning etc. Most popular RL mechanisms, however, ignore such lifelong credit assignment chains. Exceptions are the success story algorithm (1990s), AIXI (2000s), and the mathematically optimal Gödel Machine (2003).
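
The abstract's example ("in every third trial, go left to obtain reward, otherwise go right") can be made concrete with a small sketch. It assumes a hypothetical single-lifetime loop in which the agent's memory is never reset between trials. The function names are illustrative, and the counting policy is handed the pattern rather than discovering it, so this shows only why cross-trial memory matters, not how such a pattern would be learned.

```python
from typing import Callable, List


def rewarded_action(trial: int) -> str:
    """Hidden rule from the example: every third trial rewards 'left'."""
    return "left" if trial % 3 == 0 else "right"


def run_lifetime(policy: Callable[[List[int]], str], n_trials: int) -> int:
    """Run one continuous lifetime; memory persists across all trials."""
    memory: List[int] = []  # rewards observed so far, never cleared
    total = 0
    for trial in range(1, n_trials + 1):
        action = policy(memory)
        reward = 1 if action == rewarded_action(trial) else 0
        memory.append(reward)
        total += reward
    return total


def memoryless_policy(memory: List[int]) -> str:
    """Behaves as if reset before every trial: always picks 'right'."""
    return "right"


def counting_policy(memory: List[int]) -> str:
    """Uses the length of its lifelong memory to track the trial index."""
    trial = len(memory) + 1
    return "left" if trial % 3 == 0 else "right"


print(run_lifetime(memoryless_policy, 30))
print(run_lifetime(counting_policy, 30))
```

Over 30 trials the memoryless policy misses every third reward, while the policy that counts across trials collects all of them. An agent that is reset at the start of each trial cannot represent this kind of pattern at all.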

Jürgen Schmidhuber
Sat 6:30 a.m. - 6:45 a.m.
Q&A with Jürgen Schmidhuber (Q&A)
Jürgen Schmidhuber, Shagun Sodhani, Sarath Chandar
Sat 6:45 a.m. - 7:00 a.m.
Contributed Talk: Combining Variational Continual Learning with FiLM Layers (Talk)   
Noel Loo
Sat 7:00 a.m. - 7:15 a.m.
Contributed Talk: Wandering Within a World: Online Contextualized Few-Shot Learning (Talk)
Mengye Ren
Sat 7:15 a.m. - 7:45 a.m.

Modern AI systems have achieved impressive results in many specific domains, from image and speech recognition to natural language processing and mastering complex games such as chess and Go. However, they remain largely inflexible, fragile and narrow, unable to continually adapt to a wide range of changing environments and novel tasks without "catastrophically forgetting" what they have learned before, to infer higher-order abstractions allowing for systematic generalization to out-of-distribution data, and to achieve the level of robustness necessary to "survive" various perturbations in their environment - a natural property of most biological intelligent systems. In this talk, we will provide a brief overview of advances in the continual learning (CL) field [1], which aims to push AI from "narrow" to "broad", from unsupervised adaptive ("neurogenetic") architectural adaptations [2] to a recent general supervised CL framework for quickly solving new, out-of-distribution tasks, combined with fast remembering of the previous ones; it unifies continual-, meta-, meta-continual-, and continual-meta learning and introduces continual-MAML, an online extension of the popular MAML algorithm [3]. Furthermore, we present a brief overview of the most challenging setting - continual RL, characterized by a dynamic, non-stationary environment - and discuss open problems and challenges in bridging the gap between the current state of continual RL and better incremental reinforcement learners that can function in increasingly human-realistic learning environments [4]. Next, we address the robust representation learning problem, i.e. extracting features invariant to various stochastic and/or adversarial perturbations of the environment - a common goal across continual-, meta-, and transfer learning as well as adversarial robustness, out-of-distribution generalization, self-supervised learning, and related subfields.
As an example, our recent Adversarial Feature Desensitization (AFD) approach [5] trains a feature extractor network to generate representations which are both predictive and robust to input perturbations (e.g. adversarial attacks) and demonstrates a significant improvement over the state of the art, despite its relative simplicity (i.e., feature robustness is enforced via an additional adversarial decoder with a GAN-like objective attempting to discriminate between the original and perturbed inputs). Finally, we conclude the talk with a discussion of several directions for future work, including drawing inspiration (e.g., inductive biases) from neuroscience [6], in order to develop truly broad and robust lifelong-learning AI systems.

Related work:
[1] https://arxiv.org/abs/1909.08383 de Lange et al (2019). A continual learning survey: Defying forgetting in classification tasks.
[2] https://arxiv.org/abs/1701.06106 Garg et al (2017). Neurogenesis-Inspired Dictionary Learning: Online Model Adaptation in a Changing World. IJCAI 2017.
[3] https://arxiv.org/abs/2003.05856 Caccia et al (2020). Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning. submitted.
[4] (in preparation) Khetarpal et al (2020). Towards Continual Reinforcement Learning: A Review and Perspectives.
[5] https://arxiv.org/abs/2006.04621 Bashivan et al (2020). Adversarial Feature Desensitization. submitted.
[6] https://xaqlab.com/wp-content/uploads/2019/09/LessArtificialIntelligence.pdf Sinz et al (2019). Engineering a Less Artificial Intelligence. Neuron.

Irina Rish
Sat 7:45 a.m. - 8:00 a.m.
Q&A with Irina Rish (Q&A)
Irina Rish, Shagun Sodhani, Sarath Chandar
Sat 8:00 a.m. - 9:00 a.m.
Sat 9:00 a.m. - 9:15 a.m.
Contributed Talk: Deep Reinforcement Learning amidst Lifelong Non-Stationarity (Talk)
Annie Xie
Sat 9:15 a.m. - 9:30 a.m.
Contributed Talk: Lifelong Learning of Factored Policies via Policy Gradients (Talk)
Jorge Mendez Mendez
Sat 9:30 a.m. - 10:00 a.m.

A life-long learning agent should learn not only to solve problems, but also to pose new problems for itself. In reinforcement learning, the starting problems are maximizing reward and predicting value, and the natural new problems are achieving subgoals and predicting what will happen next. There has been a lot of work that provides a language for learning new problems (e.g., on auxiliary tasks and general value functions), but precious little that actually learns them (e.g., McGovern on learning subgoal states). In this talk I present a general strategy for learning new problems and, moreover, for learning an endless cycle of problems and solutions, each leading to the other. I call this cycle the FOAK cycle, because it is based on Features, Options, And Knowledge, where “options” are temporally extended ways of behaving, and “knowledge” refers to an agent’s option-conditional model of the transition dynamics of the world. The new problems in the FOAK cycle are 1) to find options that attain state features and 2) to model the consequences of those options. As these problems are solved and the models are used in planning, more abstract features are formed and made the basis for new options and models, continuing the cycle. The FOAK cycle is intended to produce a model-based reinforcement learning agent with successively more abstract representations and knowledge of its world, in other words, a life-long learning agent.

Richard Sutton
Sat 10:00 a.m. - 10:15 a.m.
Q&A by Rich Sutton (Q&A)
Richard Sutton, Shagun Sodhani, Sarath Chandar
Sat 10:15 a.m. - 10:30 a.m.
Contributed Talk: Gradient Based Memory Editing for Task-Free Continual Learning (Talk)
Xisen Jin
Sat 10:30 a.m. - 11:30 a.m.

You can submit questions for the panel at https://app.sli.do/event/xf5illav/live/questions

Eric Eaton, Martha White, Doina Precup, Irina Rish, Harm van Seijen
Sat 11:30 a.m. - 11:45 a.m.
Concluding Remarks
Sarath Chandar, Shagun Sodhani

Author Information

Shagun Sodhani (Facebook AI Research)
Sarath Chandar (Polytechnique Montreal)
Balaraman Ravindran (Indian Institute of Technology)
Doina Precup (McGill University / DeepMind)
