[ Room 307 ]
Scientific progress in machine learning is driven by empirical studies that evaluate the relative quality of models. The goal of such an evaluation is to compare machine learning methods themselves, not to reproduce single test-set evaluations of particular optimized instances of trained models. The practice of reporting performance scores of single best models is particularly inadequate for deep learning because the performance of such models depends strongly on various sources of randomness. Such an evaluation practice raises methodological questions of whether a model predicts what it purports to predict (validity), whether a model’s performance is consistent across replications of the training process (reliability), and whether a performance difference between two models is due to chance (significance). The goal of this tutorial is to provide answers to these questions by concrete statistical tests. The tutorial is hands-on and accompanied by a textbook (Riezler and Hagmann, 2021) and a webpage including R and Python code: https://www.cl.uni-heidelberg.de/statnlpgroup/empirical_methods/
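As a rough illustration of the significance question raised above, the sketch below runs a paired randomization (sign-flip permutation) test on per-example scores of two models. It is not taken from the tutorial's textbook or webpage; the function name `paired_permutation_test`, its parameters, and the toy scores are assumptions chosen for illustration.

```python
import random


def paired_permutation_test(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Two-sided paired randomization test on the mean score difference.

    Hypothetical helper: scores_a and scores_b are per-example scores of
    two models on the same test items.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have equal length")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the model labels are exchangeable per
        # example, so each difference keeps or flips its sign with
        # probability 1/2.
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total / len(diffs)) >= observed:
            extreme += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_resamples + 1)


# Toy scores (made up): model A beats model B by 0.4 on each of 20 items.
p_value = paired_permutation_test([0.9] * 20, [0.5] * 20)
```

A small p-value suggests the observed difference is unlikely under chance relabeling alone. It says nothing about validity or about reliability across training runs, which call for separate analyses.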
[ Hall F ]
Deep neural networks have achieved outstanding success in many tasks ranging from computer vision to natural language processing and robotics. However, such models still fall short in their ability to understand the world around us, as well as to generalize and adapt to new tasks or environments. One possible solution to this problem is models that comprehend causality, since such models can reason about the connections between causal variables and the effect of intervening on them. However, existing causal algorithms are typically neither scalable nor applicable to highly nonlinear settings, and they also assume that the causal variables are meaningful and given. Recently, there has been increased interest and research activity at the intersection of causality and deep learning in order to tackle the above challenges, using deep learning for the benefit of causal algorithms and vice versa. This tutorial is aimed at introducing the fundamental concepts of causality and deep learning for both audiences, providing an overview of recent works, and presenting synergies, challenges and opportunities for research in both fields.
[ Ballroom 1 & 2 ]
Machine learning algorithms leak a significant amount of information about their training data. A legitimate user of a model can reconstruct sensitive information about the training data by having access to its predictions or parameters. Given that all privacy policies and regulations require privacy auditing of (machine learning) algorithms, we are interested in a generic approach to perform quantitative reasoning about the privacy risks of various machine learning algorithms. Differentially private machine learning is currently the most widely accepted framework for privacy-preserving machine learning on sensitive data. The framework prescribes a rigorous accounting of information leakage about the training data through the learning algorithm using statistical divergences. However, it is often difficult to interpret this mathematical guarantee in terms of how a randomized algorithm limits how much an adversary can infer about one's data. For example, if a model is trained on my private emails containing personal information such as a credit card number, does DP epsilon = 10 prevent my credit card number from being leaked by the model? If I am a patient participating in a personalized cancer treatment prediction study, does DP epsilon = 5 prevent others from identifying my membership (and hence my cancer positivity) in this …
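One standard way to read an epsilon value, sketched here under the assumption of pure epsilon-DP (delta = 0), is through the bound on likelihood ratios: any output is at most e^epsilon times more likely with a record than without it, so an adversary's posterior odds of membership are at most e^epsilon times the prior odds. The helper name `max_membership_posterior` and the default prior of 0.5 are illustrative assumptions, not part of the tutorial.

```python
import math


def max_membership_posterior(epsilon, prior=0.5):
    """Upper bound on an adversary's posterior belief that a record was in
    the training set, assuming pure epsilon-DP (delta = 0).

    Hypothetical helper: `prior` is the adversary's belief before seeing
    the model.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    # Posterior odds are at most exp(epsilon) times the prior odds.
    odds = prior / (1.0 - prior) * math.exp(epsilon)
    return odds / (1.0 + odds)


bound_eps_5 = max_membership_posterior(5.0)    # roughly 0.993
bound_eps_10 = max_membership_posterior(10.0)  # roughly 0.99995
```

Read this way, with an even prior, epsilon = 5 or epsilon = 10 on its own leaves this worst-case bound close to 1. The bound is a worst case over adversaries and outputs, so it does not show what a practical attack achieves.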
[ Room 307 ]
Climate change is one of the greatest challenges that society faces today, requiring rapid action from across society. In this tutorial, we will provide an introduction to climate change, what it means to address it, and how machine learning can play a role. From energy to agriculture to disaster response, we will describe high-impact problems where machine learning can help, e.g., by providing decision-relevant information, optimizing complex systems, and accelerating scientific experimentation. These problems encompass exciting opportunities for both methodological innovation and on-the-ground implementation. We will also describe avenues for machine learning researchers and practitioners to get involved, alongside key considerations for the responsible development and deployment of such work. While this tutorial will primarily discuss opportunities for machine learning to help address climate change, it is worth noting that machine learning is a general-purpose technology that can be used for applications that both help and hinder climate action. In addition, machine learning has its own computational and hardware footprint. We will therefore briefly present a framework for understanding and contextualizing machine learning’s overall climate impacts, and describe associated considerations for machine learning research and practice as a whole. Through the course of this tutorial, we hope that participants will …
[ Ballroom 1 & 2 ]
This tutorial will give an overview of the theoretical foundations of interactive decision making (high-dimensional/contextual bandits, reinforcement learning, and beyond), a promising paradigm for developing AI systems capable of intelligently exploring unknown environments. The tutorial will focus on connections and parallels between supervised learning/estimation and decision making, and will build on recent research which provides (i) sample complexity measures for interactive decision making that are necessary and sufficient for sample-efficient learning, and (ii) unified algorithm design principles that achieve optimal sample complexity. Using this unified approach as a foundation, the main aim of the tutorial will be to give a bird’s-eye view of the statistical landscape of reinforcement learning (e.g., what modeling assumptions lead to sample-efficient algorithms). Topics covered will range from basic challenges and solutions (exploration in tabular RL, policy gradient methods, contextual bandits) to the current frontier of understanding. We will also highlight practical algorithms.
[ Hall F ]
One of the key challenges in developing intelligent and autonomous learning agents is their ability to effectively interact with humans. In this tutorial, we plan to cover the theoretical and practical foundations of interactive agents. Specifically, in the first part of the tutorial, we will focus on models of human behavior in isolation, how these models can be used for effective coordination and how they can be optimized for influencing the partner. In the second part of the tutorial, we will continue by introducing co-adaptation settings, where the human preferences are non-stationary and they adapt, and we will discuss how this leads to the emergence of new norms, conventions, and equilibria. Finally, we will wrap up by introducing approaches for inferring human partner preferences using a range of offline and online sources of data present in interactive domains. Throughout this tutorial, we will also go over concrete examples from applications in autonomous driving, mixed-autonomy traffic networks, personal robotics, and multi-agent games.
[ Room 307 ]
AI plays an increasingly prominent role in modern society since decisions that were once made by humans are now delegated to automated systems. These systems are currently in charge of deciding bank loans, criminals' incarceration, and the hiring of new employees, and it is not hard to envision that soon they will underpin most of society's decision infrastructure. Despite the high stakes entailed by this task, there is still a lack of formal understanding of some basic properties of such systems, including issues of fairness, accountability, and transparency. In this tutorial, we introduce a framework of causal fairness analysis, with the intent of filling in this gap, i.e., understanding, modelling, and possibly solving issues of fairness in decision-making settings. The main insight of our approach will be to link the quantification of the disparities present in the observed data with the underlying, and often unobserved, causal mechanisms that generate the disparity in the first place. We will study the problem of decomposing variations, which results in the construction of empirical measures of fairness that attribute such variations to the causal mechanisms that generated them. Such attribution of disparity to specific causal mechanisms will allow us to propose a formal and …
[ Ballroom 1 & 2 ]
Sampling from a target probability distribution whose density is only known up to a normalisation constant is a fundamental problem in statistics and machine learning. While the literature on optimization for machine learning has developed widely in the past decade, with fine convergence rates for some methods, the literature on sampling remained mainly asymptotic until very recently. Since then, the machine learning community has been increasingly interested in the non-asymptotic analysis of sampling algorithms, or in designing new schemes to improve the complexity of sampling. Interestingly, approximating a target probability distribution can be cast as an optimization problem where the objective functional measures the dissimilarity to the target distribution. In particular, the Kullback-Leibler divergence (or relative entropy) with respect to the target distribution is a suitable objective functional when the normalisation constant is intractable, as is commonly the case in Bayesian inference. This optimization problem can be addressed using optimization techniques over a space of probability measures. The theory of Wasserstein gradient flows provides tools to solve this optimization problem. Indeed, Wasserstein gradient flows are continuous paths of distributions that decrease the objective functional. Moreover, several sampling algorithms such as Langevin Monte Carlo or Stein Variational Gradient Descent …
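To make the setting concrete, here is a minimal sketch of unadjusted Langevin Monte Carlo, one of the algorithms named above. It uses only the gradient of the log-density, so the normalisation constant never appears. The function name `langevin_monte_carlo`, the step size, and the Gaussian toy target are assumptions chosen for illustration, not the tutorial's own code.

```python
import numpy as np


def langevin_monte_carlo(grad_log_density, x0, step_size=0.1,
                         n_steps=20_000, seed=0):
    """Unadjusted Langevin algorithm (hypothetical helper).

    grad_log_density: gradient of the log of the (possibly unnormalised)
    target density. There is no Metropolis correction, so the samples
    carry a discretization bias that shrinks with the step size.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_steps,) + x.shape)
    for k in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Gradient step on the log-density plus Gaussian noise.
        x = x + step_size * grad_log_density(x) + np.sqrt(2.0 * step_size) * noise
        samples[k] = x
    return samples


# Toy target (an assumption): a standard 2-D Gaussian, whose log-density
# gradient is -x regardless of the normalising constant.
samples = langevin_monte_carlo(lambda x: -x, x0=np.zeros(2))
```

For this toy target the empirical mean and variance of the later samples should be close to 0 and 1, up to the bias introduced by the finite step size.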
[ Hall F ]
In recent years, researchers in ML and systems have been working together to bring big models -- such as GPT-3 with 175B parameters -- into research and production. It has been revealed that increasing model sizes can significantly boost ML performance, and even lead to fundamentally new capabilities.
However, experimenting with and adopting big models calls for new techniques and systems to support their training and inference on big data and large clusters. This tutorial identifies research and practical pain points in model-parallel training and serving. In particular, this tutorial introduces new algorithmic techniques and system architectures for addressing the training and serving of popular big models, such as GPT-3, PaLM, and vision transformers. The tutorial also includes a session on how to use the latest open-source system toolsets to support the training and serving of big models. Through this tutorial, we hope to lower the technical barrier of using big models in ML research and bring big models to the masses.