ICML 2021 Tutorials

Continual Learning with Deep Architectures

Vincenzo Lomonaco · Irina Rish

[ Virtual ]

Humans have the extraordinary ability to learn continually from experience. Not only can we apply previously learned knowledge and skills to new situations, but we can also use these as the foundation for later learning. One of the grand goals of Artificial Intelligence (AI) is building an artificial “continual learning” agent that constructs a sophisticated understanding of the world from its own experience through the autonomous incremental development of ever more complex knowledge and skills (Parisi, 2019).

However, despite early speculations and a few pioneering works (Ring, 1998; Thrun, 1998; Carlson, 2010), very little research and effort has been devoted to addressing this vision. Current AI systems suffer greatly when exposed to new data or environments that differ even slightly from those for which they have been trained (Goodfellow, 2013). Moreover, the learning process is usually constrained to fixed datasets within narrow and isolated tasks, which is unlikely to lead to the emergence of more complex and autonomous intelligent behaviors. In essence, continual learning and adaptation capabilities, while more often than not thought of as fundamental pillars of every intelligent agent, have been mostly left out of the main AI research focus.

In this tutorial, we propose to summarize the application of …

Natural-XAI: Explainable AI with Natural Language Explanations

Oana-Maria Camburu · Zeynep Akata

[ Virtual ]

Abstract

In this tutorial, we will present the emerging direction of explainability that we will refer to as Natural-XAI. Natural-XAI aims to build AI models that (1) learn from natural language explanations for the ground-truth labels at training time, and (2) provide such explanations for their predictions at deployment time. For example, a self-driving car would not only see at training time that it has to stop in a certain environment, but it would additionally be told why this is the case, e.g., “Because the traffic light in front is red.” At usage time, the self-driving car would also be able to provide such natural language explanations for its actions, thus reassuring the passengers. This direction has recently received increasing attention.

Responsible AI in Industry: Practical Challenges and Lessons Learned

Krishnaram Kenthapadi · Ben Packer · Mehrnoosh Sameki · Nashlie Sephus

[ Virtual ]

Abstract

In this tutorial, we will present a brief overview of responsible AI, highlighting model explainability, fairness, and privacy in AI, key regulations/laws, and techniques/tools for providing understanding around web-based AI/ML systems. Then, we will focus on the application of explainability, fairness assessment/unfairness mitigation, and privacy techniques in industry, wherein we present practical challenges/guidelines for using such techniques effectively and lessons learned from deploying models for several web-scale machine learning and data mining applications. We will present case studies across different companies, spanning application domains such as search and recommendation systems, hiring, sales, lending, and fraud detection. We will emphasize that topics related to responsible AI are socio-technical, that is, they are topics at the intersection of society and technology. The underlying challenges cannot be addressed by technologists alone; we need to work together with all key stakeholders (such as customers of a technology, those impacted by a technology, and people with a background in ethics and related disciplines) and take their inputs into account while designing these systems. Finally, based on our experiences in industry, we will identify open problems and research directions for the machine learning community.

Synthetic Healthcare Data Generation and Assessment: Challenges, Methods, and Impact on Machine Learning

Ahmed M. Alaa · Mihaela van der Schaar

Abstract

In this tutorial, we provide an overview of state-of-the-art techniques for synthesizing the two most common types of clinical data, namely tabular (or multidimensional) data and time-series data. In particular, we discuss various generative modeling approaches based on generative adversarial networks (GANs), normalizing flows, and state-space models for cross-sectional and time-series data, demonstrating the use cases of such models in creating synthetic training data for machine learning algorithms and highlighting the comparative strengths and weaknesses of these different approaches. In addition, we discuss the issue of evaluating the quality of synthetic data and the performance of generative models; we highlight the challenges associated with evaluating generative models as compared to discriminative predictions and present various metrics that can be used to quantify different aspects of synthetic data quality.

From ML research to ML products: A path towards building models with real-world impact

Gholamreza Salimi-Khorshidi · Peyman Faratin

Abstract

Scientists in the field of machine learning (ML), including deep learning (DL), aspire to build better models (usually judged by beating SOTA in well-defined tasks and datasets); successful applications of such models, on the other hand, are about product-market fit (PMF) in environments with ever-growing complexities. As many expect ML to play a bigger role in our society, ML scientists’ ability to influence this journey will depend on putting ML research in a PMF context and vice versa (i.e., optimising for market.fit(model.fit()) + α*model.fit(market.fit()) instead of optimising for model.fit() alone). Therefore, in this tutorial we aim to cover the general principles of building AI products in the “real world”, covering topics such as product design/management, achieving product-market fit, and ML R&D in this context.

Schedule
All times are EST

+ Session 1 (11:00 a.m. - 11:15 a.m.): Overview of tutorial and the core idea (R. Khorshidi)
+ Session 2 (11:15 a.m. - 11:45 a.m.): Product Market Fit (R. Khorshidi)
- Break (11:45 a.m. - 12:00 p.m.)
+ Session 3 (12:15 p.m. - 12:30 p.m.): Build Measure Learn (R. Khorshidi)
+ Session 4 (12:30 p.m. - 1:00 p.m.): Experiments and Metrics (R. Khorshidi)
- Break (1:00 p.m. - 1:15 p.m.) …

Sparsity in Deep Learning: Pruning and growth for efficient inference and training

Torsten Hoefler · Dan Alistarh

Abstract

This tutorial will give a detailed overview of the work on sparsity in deep learning, covering sparsification techniques for neural networks from both the mathematical and implementation perspectives. We specifically aim to cover the significant recent advances in the area and put them in the context of the foundational work performed on this topic in the 1990s.
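As a rough illustration of what a sparsification technique can look like in code, the sketch below implements plain magnitude pruning: the weights with the smallest absolute values are set to zero until a target sparsity is reached. This is a generic, widely used baseline chosen here for illustration; it is not taken from the tutorial materials, and the function name `magnitude_prune` and its `sparsity` parameter are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `weights` is a flat list of floats and `sparsity` is the fraction of
    entries to remove (0.0 keeps everything, 1.0 removes everything).
    Both names are illustrative assumptions, not part of any library API.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")

    # Number of weights to zero out (rounded down).
    n_prune = int(len(weights) * sparsity)

    # Indices ordered by ascending magnitude; Python's sort is stable,
    # so ties are broken by original position.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])

    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2]
print(magnitude_prune(weights, sparsity=0.5))
# -> [0.9, 0.0, 0.3, -0.7, 0.0, 0.0]
```

In practice, pruning of this kind is usually applied per layer or globally across a trained network and followed by fine-tuning; the sketch only shows the selection step on a flat list.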

Social Implications of Large Language Models

Hal Daumé III · Kate Crawford

Abstract

This tutorial will address the wider social and economic implications of large language models, such as ELMo (Peters et al., 2018), BERT (Devlin et al., 2019), GPT-2 and GPT-3 (Radford et al., 2019; Brown et al., 2020), FlauBERT (Le et al., 2020), XLNet (Yang et al., 2019), CPM (Zhang et al., 2020), PALM (Bi et al., 2020), Switch C (Fedus et al., 2021) and others. Over the past few years, the resources put into developing bigger language models trained on more data have been unparalleled. And yet, the full repercussions of this record concentration of resources have been little discussed. In this tutorial, we aim to address concerns around the economic, political, social, and legal impacts of the development of large language models.

Our tutorial includes guest presentations by:
Emily Bender
Su Lin Blodgett
Emma Strubell
Ari Waldman
Glen Weyl
Thanks to these five scholars for providing their expertise!

Unsupervised Learning for Reinforcement Learning

Aravind Srinivas · Pieter Abbeel

[ Virtual ]

Abstract

The tutorial will be about the intersection of Unsupervised Learning and Reinforcement Learning. Unsupervised Learning (UL) has really taken off in the past few years with the advent of language-model-based pre-training in natural language processing and contrastive learning in computer vision. One of the main advantages of unsupervised pre-training in these domains is the emergent data efficiency in downstream supervised learning tasks. There is a lot of interest in the community in how these techniques can be applied to reinforcement learning and robotics. It may not be as straightforward, given that RL and robotics present further challenges compared to passive learning from images and text on the internet, due to the sequential decision-making nature of the problem. This tutorial will cover the foundational blocks of how to apply and use unsupervised learning in reinforcement learning, with the hope that people can take away knowledge of the latest state-of-the-art techniques and practices as well as the wide array of future possibilities and research directions in this challenging and interesting intersection.

Random Matrix Theory and ML (RMT+ML)

Fabian Pedregosa · Courtney Paquette · Thomas Trogdon · Jeffrey Pennington

[ Virtual ]

Abstract

In recent years, random matrix theory (RMT) has come to the forefront of learning theory as a tool to understand some of its most important challenges. From generalization of deep learning models to a precise analysis of optimization algorithms, RMT provides analytically tractable models.

Online and non-stochastic control

Elad Hazan · Karan Singh

[ Virtual ]

Abstract

In recent years, new methods have emerged in control and reinforcement learning that incorporate techniques from regret minimization and online convex optimization. The resulting theory gives rise to provable guarantees for some longstanding questions in control and reinforcement learning: logarithmic regret and fast rates, end-to-end LQG-LQR without system knowledge, Kalman filtering with adversarial noise, black-box control with provable finite-time guarantees, tight lower bounds for system identification, and more.
The main innovation in these results stems from an online control model which replaces stochastic perturbations with adversarial ones, and the goal of optimal control with that of regret minimization. We will describe the setting, as well as new methods that are gradient-based and rely on novel convex relaxations.

Self-Attention for Computer Vision

Aravind Srinivas · Prajit Ramachandran · Ashish Vaswani

[ Virtual ]

Abstract

The tutorial will be about the application of self-attention mechanisms in computer vision. Self-attention has been widely adopted in NLP, with the fully attentional Transformer model having largely replaced RNNs and now being used in state-of-the-art language understanding models like GPT, BERT, XLNet, T5, ELECTRA, and Meena. Thus, there has been tremendous interest in studying whether self-attention can have a similarly big and far-reaching impact in computer vision. However, vision tasks have different properties compared to language tasks, so a lot of research has been devoted to exploring the best way to apply self-attention to visual models. This tutorial will cover many of the different applications of self-attention in vision in order to give the viewer a broad and precise understanding of this subfield.

Privacy in learning: Basics and the interplay

Huishuai Zhang · Wei Chen

[ Virtual ]

Abstract

In the real world, more and more customers view privacy as a concern when using an AI service, especially when the customer content consists of sensitive data. Recent research demonstrates that large language models like GPT-2 can memorize content, which can be extracted by an adversary. This poses a high privacy risk in deployed scenarios when models are trained on customer data. Differential privacy is widely recognized as a gold standard of privacy protection due to its mathematical rigor. To alleviate the privacy concern in machine learning, many research works have studied machine learning with differential privacy guarantees. It is time to clarify the challenges and opportunities for learning with differential privacy. In this tutorial, we first describe the potential privacy risk in machine learning models and introduce the background of differential privacy, then present the popular approaches of guaranteeing differential privacy in machine learning. In the rest of the tutorial, we highlight the interplay between learning and privacy. In the second section, we show how to utilize the learning property to improve the utility of private learning, especially with recent advances towards solving these challenges by exploiting the correlation across data points and the low-rank property of the …