Tutorials
Tutorial
Abstract
This tutorial explores the intersection of generative AI and reinforcement learning, demonstrating how generative models can be understood as RL agents and environments, and conversely, how RL can be viewed as generative modeling. It aims to bridge the gap between these fields, showing how insights from each can enhance the other. The tutorial will cover topics such as reinterpreting generative AI training through an RL lens, adapting generative AI to build new RL algorithms, and understanding how AI agents interacting with tools and humans create a new generative model. It will also discuss future directions and open problems, focusing on how RL can shape the future of foundation model training and enable generative AI systems to construct their own knowledge.
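To make the agent-environment view concrete, here is a minimal sketch (our illustration, not the tutorial's code) of autoregressive generation framed as a token-level MDP: states are prefixes, actions are next tokens, and a terminal reward scores the finished sequence, as in RLHF-style training. The vocabulary, policy, and reward below are toy placeholders.

```python
import random

VOCAB = ["a", "b", "<eos>"]

def policy(prefix):
    """Stand-in for a language model: a distribution over next tokens."""
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}

def reward(sequence):
    """Hypothetical terminal reward, e.g., a learned preference score."""
    return sequence.count("a")  # toy reward: prefer sequences with many 'a's

def rollout(max_len=10):
    prefix = []
    for _ in range(max_len):
        probs = policy(prefix)
        tok = random.choices(list(probs), weights=list(probs.values()))[0]
        if tok == "<eos>":
            break
        prefix.append(tok)          # environment transition: append the action
    return prefix, reward(prefix)   # reward arrives only at the episode's end

seq, r = rollout()
print(seq, r)
```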
Tutorial
Abstract
This tutorial focuses on the increasingly important area of differentially private (DP) synthetic data generation, addressing the need for robust anonymization in machine learning. Creating DP synthetic data allows for data sharing and analysis without compromising individuals' privacy, opening up possibilities for collaborative research and model training. The tutorial aims to bridge the gap between various related fields, such as DP training, DP inference, and empirical privacy testing, providing a comprehensive guide for generating DP synthetic data across different data types.
The tutorial will cover various aspects of DP synthetic data generation, starting with an introduction to different types of synthetic data and their benefits. It will then provide a brief overview of differential privacy, focusing on the key concepts needed to understand the subsequent sections. The core of the tutorial will delve into specific methods for generating DP synthetic data for tabular, image, and text data, with a significant emphasis on text data generation. The tutorial will elaborate on the main components of a DP synthetic data generation system, including which privacy guarantees to aim for and what contribution constraints to apply to the data. It will also review best practices for handling sensitive data and empirical privacy testing. Finally, …
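As a toy illustration of two of the components named above (not the tutorial's method), the sketch below bounds each user's contribution, releases a noisy histogram via the Gaussian mechanism, and resamples synthetic records from it; the noise scale `sigma` would in practice be calibrated to a chosen (ε, δ) guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)

def bound_contributions(records_by_user, max_per_user=1):
    """Contribution constraint: keep at most `max_per_user` records per user,
    so one user can change the histogram by at most `max_per_user` counts."""
    return [r for recs in records_by_user.values() for r in recs[:max_per_user]]

def dp_histogram(values, bins, sigma):
    counts, edges = np.histogram(values, bins=bins)
    noisy = counts + rng.normal(0.0, sigma, size=counts.shape)  # Gaussian mechanism
    noisy = np.clip(noisy, 0, None)
    return noisy / noisy.sum(), edges

def sample_synthetic(probs, edges, n):
    idx = rng.choice(len(probs), size=n, p=probs)
    return rng.uniform(edges[idx], edges[idx + 1])  # draw within chosen bins

users = {u: list(rng.normal(50, 10, size=3)) for u in range(200)}
bounded = bound_contributions(users, max_per_user=1)
probs, edges = dp_histogram(bounded, bins=10, sigma=2.0)  # sigma set by (eps, delta)
print(sample_synthetic(probs, edges, 5))
```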
Tutorial
Abstract
The formal basis of the theory of computation lies in the study of languages, subsets of Σ*, the set of all strings over an alphabet Σ. Models of computation can be taxonomized by the languages they can decide, i.e., the languages for which a model can be used to determine membership. For instance, finite-state automata can decide membership in the regular languages. Language models are probabilistic generalizations of languages, in which the notion of a set is relaxed into that of a probability distribution over Σ*. Recently, language models parameterized using recurrent neural networks, transformers, and state-space models have achieved enormous success in natural language processing. Similarly to how theorists have taxonomized models of deterministic computation, efforts have been made to taxonomize the expressivity of language models based on various architectures in terms of the distributions over strings they can represent. This tutorial presents a self-contained overview of the formal methods used to taxonomize the expressivity of language models, which encompass formal language and automata theory, various forms of formal logic, circuit complexity, and programming languages such as RASP.
For example, we illustrate how transformers, under varying assumptions, can be characterized by different fragments of formal logic.
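A small illustration of the two notions above (ours, not the tutorial's code): a DFA decides membership in a regular language, and attaching probabilities to its transitions and an end-of-string event relaxes the set into a distribution over Σ*. The automaton and probabilities below are toy choices.

```python
# DFA over Sigma = {a, b} for the regular language "an even number of a's".
DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
START, ACCEPT = 0, {0}

def decides(string):
    state = START
    for sym in string:
        state = DELTA[(state, sym)]
    return state in ACCEPT  # membership: exactly the strings the DFA accepts

# Probabilistic relaxation: each state gives a distribution over next symbols
# plus an end-of-string event, so every string receives a probability. Since
# state 1 never ends, the support is exactly the regular language above.
PROBS = {0: {"a": 0.3, "b": 0.3, "<eos>": 0.4},
         1: {"a": 0.5, "b": 0.5, "<eos>": 0.0}}

def string_probability(string):
    state, p = START, 1.0
    for sym in string:
        p *= PROBS[state][sym]
        state = DELTA[(state, sym)]
    return p * PROBS[state]["<eos>"]

print(decides("abab"), string_probability("abab"))
```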
Tutorial
Abstract
At the heart of deep learning’s transformative impact lies the concept of scale--encompassing both data and computational resources, as well as their interaction with neural network architectures.
Scale, however, presents critical challenges, such as increased instability during training and prohibitively expensive model-specific tuning. Given the substantial resources required to train such models, formulating high-confidence scaling hypotheses backed by rigorous theoretical research has become paramount. The first part of the tutorial will provide an overview of significant advances in the theory of scaling in deep learning, covering its historical foundations, recent breakthroughs, and practical implications for training large-scale models.
To bridge theory and practice, the tutorial explores another key mathematical ingredient of scaling: the numerical solution algorithms commonly employed in deep learning, spanning domains from vision to language models. We unify these algorithms under a common master template, making their foundational principles transparent. In doing so, we reveal the interplay between adaptation to smoothness structures via online learning and the exploitation of optimization geometry through non-Euclidean norms.
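As a hedged sketch of what such a master template can look like (our simplified instance, not the tutorial's exact formulation), each step below moves along the steepest-descent direction for a chosen norm; the Euclidean norm recovers normalized gradient descent, while the l-infinity norm recovers sign descent.

```python
import numpy as np

def steepest_direction(grad, norm="l2"):
    if norm == "l2":
        return grad / (np.linalg.norm(grad) + 1e-12)   # Euclidean geometry
    if norm == "linf":
        return np.sign(grad)                           # sign descent
    raise ValueError(norm)

def train(x, grad_fn, lr=0.1, steps=100, norm="l2"):
    for _ in range(steps):
        x = x - lr * steepest_direction(grad_fn(x), norm)
    return x

# Toy quadratic objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x0 = np.array([3.0, -4.0])
print(train(x0, lambda x: x, norm="l2"))
print(train(x0, lambda x: x, norm="linf"))
```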
Our exposition moves beyond simply building larger models--it emphasizes strategic scaling, offering insights that promise to advance the field while economizing on resources.
Tutorial
Abstract
There are many different notions of bias and fairness. When comparing subpopulations, an especially important dichotomy is between (1) equal or equitable average outcomes and (2) equal or equitable treatment. In the particular context considered here, "equal treatment" and "equal opportunity" are not too different. However, comparing the average outcome of one subpopulation to another is different and sometimes less desirable than comparing the outcomes of pairs of individuals (one individual from each subpopulation) for which the individuals in each pair are similar. The latter requires comparing outcomes via "conditioning on" or "controlling for" confounding covariates.
Conditioning on or controlling for variates helps compare only those who are comparable. That often means matching up people by their age or income, for example, and then looking at differences in results between people with similar ages or similar incomes. Yet that raises the question: how many people with exactly the same age or exactly the same income are in the data? If there are too few, they will be unrepresentative, and the randomness in the results fails to average away. This would seem to call for matching up people whose ages or incomes are only close, but not …
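To ground the matching idea, here is a toy sketch (our illustration, not the tutorial's estimator): each individual in group A is paired with the group-B individual whose covariate value is closest, and the paired outcome gaps are averaged. With the synthetic data below, the raw group-mean gap is inflated by the age difference between groups, while the matched gap recovers the true offset of 2 (up to noise and edge effects where only approximate matches exist).

```python
import numpy as np

rng = np.random.default_rng(1)

age_a = rng.uniform(20, 60, size=100)
age_b = rng.uniform(30, 70, size=100)                       # older on average
outcome_a = 0.5 * age_a + rng.normal(0, 1, size=100)        # outcome grows with age
outcome_b = 0.5 * age_b + 2.0 + rng.normal(0, 1, size=100)  # same slope, +2 offset

def matched_gap(cov_a, out_a, cov_b, out_b):
    gaps = []
    for x, y in zip(cov_a, out_a):
        j = np.argmin(np.abs(cov_b - x))     # nearest neighbour by covariate
        gaps.append(out_b[j] - y)            # outcome gap between comparables
    return np.mean(gaps)

print("raw mean gap:    ", outcome_b.mean() - outcome_a.mean())   # confounded
print("matched mean gap:", matched_gap(age_a, outcome_a, age_b, outcome_b))
```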
Tutorial
Abstract
Sequential anytime-valid inference (SAVI) provides measures of statistical evidence and uncertainty --- e-values and e-processes for testing and confidence sequences for estimation --- that remain valid at all stopping times. These allow for continuous monitoring and analysis of accumulating data and optional stopping for any reason. These methods crucially rely on nonnegative martingales, which are wealth processes of a player in a betting game, thus yielding the area of "game-theoretic statistics". This tutorial will present the game-theoretic philosophy, intuition, language, and mathematics behind SAVI, summarized in a new book https://arxiv.org/pdf/2410.23614, to be published before ICML as the first edition of the new book series, Foundations and Trends in Statistics.
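A minimal sketch of the betting idea (our toy, not the book's code): bet against the null "the coin is fair". Under the null, the bettor's wealth is a nonnegative martingale, so by Ville's inequality the probability that it ever exceeds 1/α is at most α, which makes stopping at any time safe. The plug-in betting rule below is one simple choice.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.05
p_true = 0.7                      # the coin is actually biased

wealth, lam = 1.0, 0.0
p_hat, n = 0.5, 2                 # running bias estimate with two pseudo-counts
for t in range(10_000):
    x = rng.random() < p_true
    # One-flip e-value: multiplier 1 + lam * (x - 0.5) with |lam| <= 2,
    # which has expectation 1 under the fair-coin null for any predictable lam.
    wealth *= 1.0 + lam * (x - 0.5)
    if wealth >= 1.0 / alpha:     # optional stopping is valid here
        print(f"reject fair coin at flip {t + 1}, wealth = {wealth:.1f}")
        break
    # Predictable plug-in bet: estimate the bias from past flips only.
    n += 1
    p_hat += (x - p_hat) / n
    lam = np.clip(4.0 * (p_hat - 0.5), -1.9, 1.9)
```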
Tutorial
Abstract
Associative Memories like the famous Hopfield Networks are elegant models for describing fully recurrent neural networks whose fundamental job is to store and retrieve information. In the past few years they experienced a surge of interest due to novel theoretical results pertaining to their information storage capabilities, and their relationship with SOTA AI architectures, such as Transformers and Diffusion Models. These connections open up possibilities for interpreting the computation of traditional AI networks through the theoretical lens of Associative Memories. Additionally, novel Lagrangian formulations of these networks make it possible to design powerful distributed models that learn useful representations and inform the design of novel architectures. This tutorial provides an approachable introduction to Associative Memories, emphasizing the modern language and methods used in this area of research, with practical hands-on mathematical derivations and coding notebooks.
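In the spirit of the tutorial's hands-on notebooks, here is a compact sketch of the classical Hopfield construction (a standard textbook recipe, simplified from the tutorial's scope): store binary patterns with a Hebbian rule, then retrieve one from a corrupted cue by descending the network's energy E(s) = -0.5 * s @ W @ s.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 64, 3                                   # neurons, stored patterns
patterns = rng.choice([-1, 1], size=(K, N))

W = sum(np.outer(p, p) for p in patterns) / N  # Hebbian storage
np.fill_diagonal(W, 0)

def retrieve(state, steps=20):
    for _ in range(steps):
        new = np.sign(W @ state)               # synchronous sign update
        new[new == 0] = 1
        if np.array_equal(new, state):
            break                              # reached a fixed point
        state = new
    return state

cue = patterns[0].copy()
flip = rng.choice(N, size=10, replace=False)
cue[flip] *= -1                                # corrupt 10 of 64 bits
out = retrieve(cue)
print("overlap with stored pattern:", (out == patterns[0]).mean())
```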
Tutorial
Abstract
Continuous-time generative models—particularly diffusion- and flow-based models—have emerged as a dominant paradigm in generative AI, with applications in image, video, molecular, and audio synthesis, as well as scientific modeling. Despite their success, the field’s rich mathematical structure, varied terminology, and subtle theoretical foundations often lead to confusion and fragmented understanding.
This tutorial offers a clear, unified, and accessible introduction to continuous-time generative models. Beginning with the simplified lens of rectified flow, we build a streamlined conceptual framework to support systematic exploration of the algorithmic landscape, while minimizing unnecessary mathematical overhead. We clarify commonly confused ideas and untangle key relationships—such as flow vs. diffusion, and the interplay between interpolation, noise schedules, and samplers. We also touch on advanced topics including distillation, control, and discrete and constrained generation in flow and diffusion models.
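To preview the rectified-flow lens, here is a compact training-and-sampling sketch (the standard recipe, in our own minimal form): train v(x_t, t) to match the straight-line velocity x1 - x0 along the interpolation x_t = (1 - t) x0 + t x1, then sample by integrating dx/dt = v(x, t) from noise (t = 0) to data (t = 1). The network and toy data are placeholders.

```python
import torch

torch.manual_seed(0)
dim = 2
net = torch.nn.Sequential(torch.nn.Linear(dim + 1, 64), torch.nn.SiLU(),
                          torch.nn.Linear(64, dim))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def sample_data(n):                      # toy "data": a shifted Gaussian blob
    return torch.randn(n, dim) * 0.2 + torch.tensor([2.0, 2.0])

for step in range(2000):
    x1, x0 = sample_data(256), torch.randn(256, dim)     # data and noise
    t = torch.rand(256, 1)
    xt = (1 - t) * x0 + t * x1                           # linear interpolation
    v_target = x1 - x0                                   # straight-line velocity
    loss = ((net(torch.cat([xt, t], dim=1)) - v_target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Euler sampler: follow the learned velocity field from noise to data.
x = torch.randn(1000, dim)
for i in range(100):
    t = torch.full((1000, 1), i / 100)
    x = x + net(torch.cat([x, t], dim=1)) / 100
print(x.mean(dim=0))  # should land near the data mean [2, 2]
```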
Tutorial
Abstract
Large Language Model (LLM) alignment has become an increasingly critical topic in contemporary AI research, especially as LLMs continue to scale and integrate into real-world applications. Ensuring that LLMs generate outputs aligned with human values, preferences, and ethical considerations is essential for their safe and effective deployment. This tutorial aims to provide a comprehensive introduction to LLM alignment methods, offering a structured and accessible entry point for researchers and practitioners interested in the field. It will present key concepts and challenges, introduce fundamental approaches such as Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO), and, building on these foundations, review a spectrum of refinements and variants. In addition, it will cover recent advancements in game-theoretic approaches to alignment and theoretical frameworks that provide a deeper understanding of alignment methodologies. Beyond theoretical insights, the tutorial will emphasize the practical aspects of LLM alignment, illustrating how these techniques are applied in real-world scenarios and guiding participants in building intuition about alignment strategies. By the end of the tutorial, attendees will gain a solid foundation in LLM alignment, equipping them with the knowledge needed to critically engage with the field, understand current research trends, and explore future directions.
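As a taste of the DPO approach mentioned above, here is a minimal sketch of the published DPO objective in our own notation: given summed token log-probabilities of a preferred response y_w and a dispreferred response y_l under the policy and a frozen reference model, minimize -log σ(β · ((π_w - ref_w) - (π_l - ref_l))). The batch values below are made up for illustration.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Each argument: tensor of per-example summed token log-probabilities."""
    margin = (policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l)
    return -F.logsigmoid(beta * margin).mean()

# Toy usage with made-up log-probs for a batch of 4 preference pairs.
pw = torch.tensor([-12.0, -8.0, -15.0, -9.0])   # policy on preferred
pl = torch.tensor([-11.0, -9.5, -14.0, -13.0])  # policy on dispreferred
rw = torch.tensor([-12.5, -8.5, -15.5, -9.5])   # reference on preferred
rl = torch.tensor([-11.0, -9.0, -14.5, -12.0])  # reference on dispreferred
print(dpo_loss(pw, pl, rw, rl))
```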
Tutorial
Abstract
Since the inception of large language models (LLMs), considerable attention has been directed toward the field of AI safety. These efforts aim to identify a range of best practices—including evaluation protocols, defense algorithms, and content filters—that facilitate the ethical, trustworthy, and reliable deployment of LLMs and related technologies. A key component of AI safety is model alignment, a broad concept referring to algorithms that optimize the outputs of LLMs to align with human values. And yet, despite these efforts, recent research has identified several failure modes—referred to as jailbreaks—that circumvent LLM alignment by eliciting unsafe content from a targeted model. And while initial jailbreaks targeted the generation of harmful information (e.g., copyrighted or illegal material), modern attacks seek to elicit domain-specific harms, such as digital agents violating user privacy and LLM-controlled robots performing harmful actions in the physical world. In the worst case, future attacks may target self-replication or power-seeking behaviors. The insidious nature of jailbreaking attacks represents a substantial obstacle to the broad adoption of LLMs. Therefore, it is critical for the machine learning community to study these failure modes and develop effective defense strategies that counteract them.
Over the past two years, research in both academia and industry …
Tutorial
Abstract
Diffusion models have recently gained attention as a powerful class of deep generative models, achieving state-of-the-art results in data generation tasks. In a nutshell, they are designed to learn an unknown data distribution starting from Gaussian noise, mimicking the process of non-equilibrium thermodynamic diffusion. Despite their outstanding empirical successes, the mathematical and algorithmic foundations of diffusion models remain far from mature. For instance: (i) Generalization: it remains unclear how diffusion models, trained on finite samples, can generate new and meaningful data that differ from the training set; (ii) Efficiency: due to the enormous model capacity and the requirement of many sampling steps, they often suffer from slow training and sampling speeds; (iii) Controllability: it remains computationally challenging and unclear how to guide and control the content generated by diffusion models, raising challenges over controllability and safety, as well as solving inverse problems across many scientific imaging applications.
This tutorial will introduce a mathematical framework for understanding the generalization and improving the efficiency of diffusion models, through exploring the low-dimensional structures in both the data and model. We show how to overcome fundamental barriers to improve the generalization, efficiency, and controllability in developing diffusion models, by exploring how these models adaptively …
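As background for the training objective these questions build on, here is a minimal DDPM-style sketch (the standard noise-prediction recipe, not the tutorial's framework): corrupt data x0 with Gaussian noise at a random level and train a network to predict the noise, minimizing ||ε_θ(x_t, t) - ε||² with x_t = √(ᾱ_t) x0 + √(1 - ᾱ_t) ε. The schedule, network, and toy data are placeholder choices.

```python
import torch

torch.manual_seed(0)
T, dim = 100, 2
betas = torch.linspace(1e-4, 0.02, T)
abar = torch.cumprod(1 - betas, dim=0)          # cumulative alpha-bar schedule

net = torch.nn.Sequential(torch.nn.Linear(dim + 1, 64), torch.nn.SiLU(),
                          torch.nn.Linear(64, dim))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x0 = torch.randn(256, dim) * 0.3 + 1.0      # toy data distribution
    t = torch.randint(0, T, (256,))
    eps = torch.randn_like(x0)
    a = abar[t].unsqueeze(1)
    xt = a.sqrt() * x0 + (1 - a).sqrt() * eps   # forward (noising) process
    pred = net(torch.cat([xt, t.unsqueeze(1) / T], dim=1))
    loss = ((pred - eps) ** 2).mean()           # noise-prediction objective
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())  # sampling would then run the learned reverse chain
```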
Tutorial
Abstract
Mechanistic interpretability (MI) is an emerging sub-field of interpretability that seeks to understand a neural network model by reverse-engineering its internal computations. Recently, MI has garnered significant attention for interpreting transformer-based language models (LMs), resulting in many novel insights yet introducing new challenges. Given how fast this topic is now attracting the ML/AI community's attention, the goal of this tutorial is to provide a comprehensive overview of MI for LMs, including its historical contexts, the various techniques to implement and evaluate MI, findings and applications based on MI, and future challenges. The tutorial will particularly be presented following an innovative Beginner's Roadmap that the presenters carefully curated, aiming to enable researchers new to MI to quickly pick up this field and leverage MI techniques in their LM applications.
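To illustrate one common MI technique in miniature (our sketch, not the tutorial's code), the example below performs activation patching: run a model on a "clean" and a "corrupted" input, then overwrite one layer's activation on the corrupted run with the clean run's activation to measure that layer's causal contribution. The tiny MLP stands in for a language model.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(),
                            torch.nn.Linear(8, 1))

clean = torch.randn(1, 4)
corrupted = clean + torch.tensor([[5.0, 0.0, 0.0, 0.0]])  # perturb one feature

cache = {}
def save_hook(module, inp, out):
    cache["act"] = out.detach()      # record the clean activation
def patch_hook(module, inp, out):
    return cache["act"]              # overwrite with the clean activation

layer = model[0]
h = layer.register_forward_hook(save_hook)
clean_out = model(clean); h.remove()

h = layer.register_forward_hook(patch_hook)
patched_out = model(corrupted); h.remove()

# Patching layer 0 restores the clean output, since all of the corruption
# enters the computation through that layer.
print("clean:", clean_out.item(), "corrupted:", model(corrupted).item(),
      "patched:", patched_out.item())
```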