Timezone: Asia/Seoul
Registration Desk
7:30 AM - 4:00 PM
Workshop
8:00 AM - 5:00 PM

Scaling laws -- precise power-law relationships between model performance and resources (parameters, data, compute) -- have become the central organizing principle of modern large-model training. Yet the theoretical foundations of scaling remain incomplete: despite rapid recent progress, the community still lacks a unified mathematical framework connecting optimizer dynamics, architecture choice, and data structure to the observed power-law exponents that govern training at scale. This year’s HiLD workshop focuses on building a rigorous science of scaling by bringing together theoreticians and practitioners who build and train frontier models.
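As a rough illustration of the power-law relationships described above, the sketch below fits a law of the form L(N) = a * N^(-b) by ordinary least squares in log-log space. The function name `fit_power_law`, the constants 5.0 and 0.3, and the data points are hypothetical and synthetic; they are not taken from the workshop or from any real training run.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of log y = log a - b * log x; returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    count = len(xs)
    mean_x = sum(lx) / count
    mean_y = sum(ly) / count
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The exponent b is the negated slope of the log-log line.
    return math.exp(intercept), -slope

# Synthetic, made-up data generated from loss = 5.0 * N^(-0.3).
params = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * n ** -0.3 for n in params]

a, b = fit_power_law(params, losses)
print(f"a = {a:.3f}, b = {b:.3f}")
```

On noiseless synthetic data the fit should recover the generating constants up to floating-point error; real loss curves typically need extra terms (for example an irreducible-loss offset) that this sketch omits.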

Workshop

Machine learning systems are increasingly deployed in open-ended and high-stakes environments, where distribution shift, adversarial manipulation, hallucinations, safety risks, and misalignment reveal fundamental limits of learning under incomplete information. A central challenge is the ability to recognise and reason about the limits of one’s own knowledge, especially in the presence of unknown unknowns. The 2nd Workshop on Epistemic Intelligence in Machine Learning brings together researchers from diverse areas of machine learning to develop principled and computationally tractable approaches to representing and operationalising epistemic intelligence. The workshop focuses on foundations of uncertainty beyond single-distribution representations, uncertainty-aware generative and foundation models, AI safety and alignment under objective uncertainty, and lifelong and continual learning in open worlds. By connecting theoretical frameworks with behavioural mechanisms such as abstention, deferral, querying, and safe adaptation, EIML aims to provide a unifying perspective on how learning systems can reason under unknown unknowns and guide robust, safe, and trustworthy real-world deployment.

Workshop

Workshop on Mechanistic Interpretability

Andrew Lee ⋅ Ivan Arcuschin Moreno
8:00 AM - 5:00 PM

We propose a third Workshop on Mechanistic Interpretability – the study of how neural networks function – following highly successful workshops at ICML 2024 and NeurIPS 2025, the latter of which attracted over 600 attendees. Mechanistic Interpretability is a cross-cutting area with relevance to multiple topics at ICML: anyone who has trained or interacted with neural networks has likely wondered how they work, and our current lack of understanding causes significant issues for safety and scientific understanding. We have designed our program to foster discussion of the debate arising between pragmatic and ambitious approaches in the field, in addition to showcasing and sharing knowledge on emerging methodologies.

Workshop

SCALE: Scalable Learning and Optimization for Efficient Multimodal AI Agents

Souvik Kundu ⋅ Digbalay Bose ⋅ Sayan Nag ⋅ Jaehong Yoon ⋅ Manling Li ⋅ Hongyi Wang ⋅ Lanqing Guo ⋅ Sanjoy Chowdhury
8:00 AM - 5:00 PM

This workshop seeks to bring together researchers from diverse backgrounds to explore emerging topics including, but not limited to: a) multi-modal agentic learning: learning algorithms, pipelines, and architectures for multimodal agents, spanning pretraining and fine-tuning to test-time tuning and adaptation; b) efficient agentic AI systems: developing scalable and verifiable agentic AI systems across heterogeneous compute platforms with limited compute and memory budgets; c) scaling of multi-modal agents: understanding and improving the test-time scaling and reasoning capabilities of multi-modal agentic systems, and mixture-of-agents for task scaling; d) multi-modal agents for planning: pushing the boundaries of real-life physical reasoning and planning for agentic AI; e) evaluation and benchmarking: principled metrics and benchmarks for reasoning, memory, robustness, and efficiency in multimodal agents; f) memory of agents: understanding and improving multi-modal agentic memory for reasoning capabilities.

Workshop

Generative and Agentic AI for Biology

Wengong Jin ⋅ Lei Li ⋅ Divya Nori ⋅ Aditi Krishnapriyan ⋅ Christian Dallago ⋅ Ramith Hettiarachchi
8:00 AM - 5:00 PM

The 2024 Nobel Prize in Chemistry, awarded for AI-based protein structure prediction and protein design, underscored the transformative impact of machine learning on the life sciences. Generative AI models, including large language models, diffusion models, and foundation models for biological sequences and cells, have demonstrated remarkable success in modeling and designing biomolecules and biological systems. However, a new paradigm is emerging. Beyond generating biological sequences or structures, AI systems are beginning to act as agents: formulating hypotheses, planning experiments, interacting with tools and databases, and iteratively refining scientific strategies. This workshop aims to explore the future of AI for biology at the intersection of these two paradigms. Rather than focusing solely on incremental advances in generative modeling, we seek to engage the community in a deeper discussion about the conceptual and practical foundations of AI-driven biological discovery. Key questions include:

* Will agentic AI subsume generative models, or are they complementary components of future scientific systems?
* In what biological problems is agentic AI necessary?
* What architectures are required for AI systems that reason across molecules, cells, tissues, and organisms?
* How should we evaluate AI agents that participate in biological discovery?
* What is the role of human scientists in an era of AI-driven hypothesis generation and experimentation?

We aim to discuss these questions through invited talks, poster presentations, and panel discussions on the following topics:

* Generative models for biomolecule and therapeutic design.
* Agent-based systems for hypothesis generation, experimental planning, and closed-loop wet-lab integration.
* Foundation models and world models for multi-scale biology.
* Benchmarks and evaluation frameworks for autonomous scientific systems.
* Human-AI collaboration paradigms in biological research.
* Safety, governance, and ethical considerations of autonomous biological AI systems.

Workshop

From Frames to Stories (F2S): Toward Reliable, Controllable and Trustworthy Long-Horizon Video Generation

Yu Lu ⋅ Junhao Dong ⋅ Enis Simsar ⋅ Hila Chefer ⋅ Ismini Lourentzou ⋅ Piotr Koniusz ⋅ Yi Yang
8:00 AM - 5:00 PM

Video generation has advanced rapidly for short clips, but minutes-long, multi-shot generation remains unreliable due to compounding errors, identity drift, and weak long-range coherence. Long-horizon video therefore provides a demanding testbed for long-context multimodal modeling, inference-time computation, and interactive generation, aligning closely with core ICML interests. We focus on three bottlenecks: (i) persistent state representation, what to store (and how to compress it) to ensure that identities, scene dynamics, and narrative facts remain consistent; (ii) interactive control that steers future states with rich and compositional signals (shot plans, localized edits, multimodal constraints, actions) over long horizons; and (iii) trustworthy evaluation via minutes-scale protocols that measure consistency and control adherence in a reproducible, hard-to-game way. We highlight research directions where conceptual and methodological advances, rather than model scale alone, drive progress under realistic academic compute budgets. The program combines invited talks, contributed spotlights, posters/demos, a panel, and breakout groups on open problems with report-back.

Workshop

Continual Adaptation at Scale: Towards Sustainable AI

Ghada Sokar ⋅ Gintare Karolina Dziugaite ⋅ Mohammad Emtiyaz Khan ⋅ Rupam Mahmood ⋅ Martin Mundt ⋅ Daniel Marczak
8:00 AM - 5:00 PM

Training Foundation Models (FMs) is currently so costly that only a few can afford it. The immense data, compute, and energy demands are increasingly unsustainable. Continual adaptation offers a viable alternative, where AI models can learn quickly and continually through everyday interactions, just like humans and animals. Unfortunately, FMs lack this rapid adaptability: new behavior in FMs can be induced by prompting or fine-tuning, but there are no easy ways to quickly shape the behavior, for instance, to permanently add, remove, or modify their skill set in a sustainable way. This workshop aims to discuss new research directions that will enable fast continual adaptation at scale to drive more sustainable AI.

Workshop

Deep Learning for Code: Towards Human-Centered Coding Agents

Terry Yue Zhuo ⋅ Zijian Wang ⋅ Zhiruo Wang ⋅ Wen-Ding Li ⋅ Giovanni Zappella ⋅ Qian Liu
8:00 AM - 5:00 PM

AI coding agents have rapidly improved in their ability to perform complex software engineering tasks autonomously. However, as these systems advance, the main bottleneck to real-world usefulness is shifting from task-solving capability to challenges in communication, oversight, and trust between humans and agents. This year, the 5th Deep Learning for Code (DL4C) workshop at ICML will focus on human-centered coding agents: systems designed not only to complete tasks, but to collaborate effectively with humans. Building on previous DL4C editions (ICLR '22, '23, '25; NeurIPS '25; https://dl4c.github.io), the workshop will highlight interaction-level questions such as task alignment, verifiability, steerability, and adaptability in human-agent workflows. We aim to bring together researchers from ML, NLP, HCI, and SE to develop shared evaluation methods, user-involved coding environments, and scalable approaches to studying human-AI collaborative coding. By emphasizing human-centered design, the workshop seeks to advance coding agents that are more controllable, interpretable, and broadly useful in practice.

Workshop

AI as a Tool for Mathematics, Computer Science, and Machine Learning

Dmitriy Drusvyatskiy ⋅ Mikhail Belkin ⋅ Edgar Dobriban ⋅ Fanny Yang ⋅ Qingsong Wang
8:00 AM - 5:00 PM

Modern AI systems increasingly assist researchers with coding, exposition, and fragments of mathematical reasoning, yet turning these capabilities into dependable research progress remains nontrivial. This workshop focuses on AI as a practical research instrument for the mathematics/CS/ML community: not merely improving theorem-proving benchmarks, but developing transferable, reproducible workflows that help researchers generate, stress-test, and refine real results. The program will cover (i) AI-assisted mathematical research workflows, including iterative verification loops, decomposition and self-critique, multi-agent strategies, and common failure modes with concrete detection/mitigation tactics; (ii) tool-augmented reasoning, integrating LLMs with computation (code, symbolic algebra, numerics), literature navigation, and proof assistants (e.g., Lean) to reduce hallucinations and improve reliability; and (iii) research acceleration across ML/CS, including derivations, counterexample search, and experiment design methods that generalize across subfields. The workshop is structured as a full-day hybrid event with confirmed in-person invited talks, a demo/poster session featuring accepted contributions (4-page submissions emphasizing usable workflows), and a structured debate and panel on whether AI-generated analyses and conclusions will become as trustworthy as those of leading theoretical researchers within five years. The intended outcome is a durable community resource: a shared set of actionable practices for rigorous AI-assisted research.

Workshop

Combining Theory and Benchmarks: Towards A Virtuous Cycle to Understand and Guarantee Foundation Model Performance

Brian Hu ⋅ Nathaniel Bottman ⋅ Yaoqing Yang ⋅ Yujun Yan ⋅ Guido Montufar ⋅ Jaejin Lee
8:00 AM - 5:00 PM

Benchmarks such as HELM [1] and Big-Bench [2] have significantly advanced quantitative model evaluation. However, current practice remains largely empirical: it measures performance but does not provide guarantees on what capabilities a model has, when those capabilities will or will not manifest, and why. Debates over emergent abilities at scale illustrate this lack of predictive understanding [3,4]. In parallel, a substantial body of theoretical work addresses scaling laws for pre-training [5,6], generalization [7,8,9], and benchmark predictability [10]. Yet these theoretical advances are often disconnected from real-world benchmarking, and as a result, theoretical insights rarely inform benchmark design. This structural disconnect limits our ability to make reliable claims about model behavior, constrains trustworthy deployment, and slows the development of foundation models. This workshop will focus on advancing a predictive science of foundation model performance. We structure the workshop around three key research challenges: 1) Quantification of capabilities across task levels: How can we move from scores to formal, quantitative guarantees on performance across task levels? 2) Foundations of generalization and composition: Which mathematical frameworks can explain when and why models generalize? 3) Reliable and structured empirical evaluation: How should benchmarks be constructed to evaluate reasoning, robustness under distribution shift, and calibrated uncertainty? This workshop will convene researchers across mathematics, statistics, machine learning, and industry to catalyze a new research agenda that tightly couples theory and empirical evaluation of foundation models. By advancing frameworks that make performance predictable and quantifiable, we aim to influence: (i) how benchmarks are designed, (ii) how models are stress-tested, and (iii) how reliability claims are substantiated.
These developments have direct implications for large-scale deployment, evaluation pipelines, and red-teaming practices in industry. More broadly, the workshop will help accelerate the emergence of a principled science of foundation models grounded in predictive theory, structured evaluation, and rigorous performance guarantees.

Workshop

The Second Workshop on AI4NextG: AI and ML for Next-Generation Wireless at ICML 2026 aims to bridge the significant and urgent gap between AI/ML research and real-world wireless system development, particularly as NextG (e.g., 6G) standardization efforts accelerate. While AI/ML holds transformative potential for wireless networks, a persistent disconnect remains between advances in ML theory/algorithms and the practical constraints of reliability, standard compliance, latency, and deployment within legacy infrastructures. Building on the strong success of the inaugural AI4NextG workshop at NeurIPS 2025, which brought together AI/ML researchers and wireless experts from both academia and industry, this edition will place a deliberate emphasis on deepening academia–industry collaboration. Through invited talks, interactive panels, and technical presentations spanning the full protocol stack, the workshop seeks to catalyze co-designed research agendas, accelerate deployable AI-native wireless solutions, and position the ICML community at the forefront of next-generation wireless innovation.

Workshop

Foundations of Deep Generative Models: Understanding Memorization, Generalization, and Reasoning

Wei Huang ⋅ Renyuan Xu ⋅ Qing Qu ⋅ Valentin De Bortoli ⋅ Molei Tao ⋅ John Vastola ⋅ Peng Wang ⋅ Xiao Li ⋅ Georgios Papaioannou
8:00 AM - 5:00 PM

Recently, diffusion models, flow-based models, and autoregressive language models have emerged as a powerful class of deep generative models (DGMs) with remarkable generation capabilities across a wide range of applications, including image synthesis, video generation, natural language generation, and scientific discovery. Despite these successes, they still face significant challenges, particularly in understanding memorization, generalization, and reasoning, which limit their reliability, interpretability, and broader adoption in many scientific disciplines. This workshop will bring together researchers from both theoretical and applied communities to address these challenges, providing a focused forum for exchanging ideas, identifying key open problems, and fostering new collaborations in this rapidly evolving area.

Workshop

Graph Foundation Models: A New Era for Graph Machine Learning

Charilaos Kanatsoulis ⋅ Xingyue Huang
8:00 AM - 5:00 PM

Graph-structured data are ubiquitous across science and industry, yet today’s graph machine learning (GML) pipelines remain largely task- and dataset-specific, limiting robustness and transferability. This workshop brings together researchers and practitioners to advance graph foundation models (GFMs): models that pretrain once and adapt broadly across heterogeneous, temporal, and multimodal graphs. We will catalyze exchange on core questions spanning: architectural choices (GNNs, Transformers, and LLM-integrated pipelines), graph tokenization and structural encodings, pretraining objectives and scaling laws, and principled evaluation for cross-graph transfer. The scope covers diverse domains, including knowledge graphs, molecular and biological networks, relational databases, recommender systems, and social networks, emphasizing both methodological rigor and real-world impact. Through invited keynotes, contributed talks, posters, and panel discussions, the workshop aims to (i) consolidate design principles for GFMs, (ii) establish shared datasets, metrics, and reproducible protocols, and (iii) chart a community roadmap for scalable, transferable, and trustworthy graph learning.

Workshop

Workshop on Weight-Space Symmetries: from Foundations to Practical Applications

Yani Ioannou ⋅ Boris Knyazev ⋅ Ekaterina Lobacheva ⋅ Adnan Mohammed ⋅ Antonio Orvieto ⋅ Alexander Theus
8:00 AM - 5:00 PM

Neural networks are highly over-parameterized models whose weight spaces exhibit rich symmetries, for example, neuron permutations. These symmetries create large equivalence classes of functionally identical solutions and have profound implications for the structure of the loss landscape, optimization, and design of practical algorithms. While significant progress has been made in characterizing these symmetries and their effects, a unified understanding remains elusive. Simultaneously, there is growing interest in practical applications of weight-space symmetries, such as training acceleration, model merging, weight-space learning, and more. The goal of this workshop is to bring together researchers from academia and industry to translate theoretical advances in weight-space symmetries into practical, scalable methods, fostering a coherent framework and highlighting approaches that are computationally feasible at scale.
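The neuron-permutation symmetry mentioned above can be checked numerically. The sketch below is a minimal illustration, assuming a hypothetical two-layer tanh network with made-up weights: reordering the hidden units (rows of the first layer together with the matching output weights) changes the parameters but not the function computed.

```python
import math

def mlp(x, W1, b1, w2):
    """Two-layer MLP: y = sum_j w2[j] * tanh(W1[j] . x + b1[j])."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(v * h for v, h in zip(w2, hidden))

def permute_hidden(W1, b1, w2, perm):
    """Reorder hidden units consistently across W1, b1, and w2."""
    return ([W1[i] for i in perm],
            [b1[i] for i in perm],
            [w2[i] for i in perm])

# Made-up weights for a network with 2 inputs and 3 hidden units.
W1 = [[0.5, -1.0], [2.0, 0.3], [-0.7, 0.9]]
b1 = [0.1, -0.2, 0.05]
w2 = [1.5, -0.4, 0.8]
x = [0.3, -1.2]

W1p, b1p, w2p = permute_hidden(W1, b1, w2, [2, 0, 1])

# The two outputs agree up to floating-point summation order.
print(mlp(x, W1, b1, w2), mlp(x, W1p, b1p, w2p))
```

With h hidden units there are h! such reorderings, which is one source of the large equivalence classes of functionally identical solutions.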

Workshop
8:00 AM - 5:00 PM

We propose an ICML workshop on Technical AI Governance Research (TAIGR). TAIGR encompasses technical analysis and tools that support the effective governance of AI, such as evaluations, safeguards, and access controls. Despite increasing interest in TAIGR, the field lacks a consistent, dedicated venue for sharing research. This will be the second edition of the workshop, building on the inaugural technical AI governance workshop held at ICML 2025.

Workshop
8:00 AM - 5:00 PM

Probabilistic approaches have been one of the core engines of machine learning for decades: they provide a language for uncertainty, latent structure, and decision-making under incomplete information or noisy observations. In parallel, generative modeling has long been an important branch of this toolkit, from large language models to diffusion models. While the empirical success of these models has largely been driven by scaling and benchmark-oriented engineering efforts, probabilistic principles have not faded into irrelevance; if anything, they have become increasingly vital for leveraging models in more complex tasks in the era of foundation models and real-world deployment. The mission of this workshop is to create a forum for research that is driven not solely by prevailing trends, but by well-reasoned scientific beliefs and long-term vision. We aim to bring together researchers working on structured probabilistic inference, generative modeling, and their intersections with modern foundation models. We particularly encourage contributions that explore emerging, unconventional, or underexplored directions that may shape the future of the field. By fostering dialogue across communities, including theoretical probabilistic modeling, generative modeling, information theory, and large-scale foundation model research, we hope to identify enduring principles, rediscover overlooked ideas, and inspire new frameworks that unify structure, scalability, and uncertainty. Ultimately, this workshop seeks to highlight that probabilistic thinking is not only foundational to the past and present of machine learning but also essential to its future trajectory.

Workshop

AI for Law Workshop

Yu Fan ⋅ Fabio Fehr ⋅ Aniket Kesari ⋅ Robert Mahari ⋅ Julian Nyarko ⋅ Hai Park ⋅ Yang Tian ⋅ Frederike Zufall
8:00 AM - 5:00 PM

Recent advances in machine learning have substantially improved general-purpose reasoning, multimodal understanding, and test-time scaling. Yet law remains a uniquely demanding and high-stakes domain that exposes the limits of generic AI progress. Many legal tasks require structured, long-form deductive and inductive reasoning grounded in doctrine, sensitivity to jurisdictional and linguistic variation, and robustness in settings where errors carry serious real-world consequences. They also raise fundamental questions about evaluation, fairness, and access to justice. Legal reasoning thus complements established AI reasoning domains, such as mathematics and coding, by emphasizing context-sensitive, norm-governed inference embedded in real-world institutions. This workshop centers on a core question: What does it mean for an AI system to be competent in law, and how can such competence be built, evaluated, and validated across jurisdictions and languages while enabling equitable access to justice? We structure the discussion around three interconnected themes:

- AI for Legal Reasoning, focusing on domain-specific supervision, doctrinal grounding, and task design for robust legal inference;
- AI Evaluation for Law, addressing reliable, risk-aware, and jurisdiction-sensitive evaluation paradigms; and
- AI for Access to Justice, examining the technical and institutional conditions under which AI systems improve, or risk undermining, equitable legal access.

To operationalize these themes, we will host a multilingual shared task on long-form legal reasoning across jurisdictions and languages, emphasizing doctrinal and jurisdictional grounding, reasoning quality, and cross-lingual robustness. By bridging machine learning and legal scholarship, the workshop aims to articulate a research agenda for AI systems that are not only more capable, but also more legally grounded and socially responsible.

Workshop

Learning to Listen: ICML 2026 Workshop on Machine Learning for Audio

Alice Baird ⋅ Chris Donahue ⋅ Sander Dieleman ⋅ Brian Kulis ⋅ David Liu ⋅ Rachel Manzelli ⋅ Shrikanth Narayanan
8:00 AM - 5:00 PM

Machine learning for audio has seen heightened interest over the past year, driven by audio language models and multimodal/foundation models for understanding and generating speech, music, and audio events, as well as rising demand for low-latency voice agents and real-time transcription. We propose the Machine Learning for Audio workshop at ICML 2026 to provide a dedicated forum for audio researchers and practitioners to exchange ideas, share tools and benchmarks, form collaborations, and engage in timely ethical discussion around generative audio and audio foundation models. The workshop will cover topics including generative synthesis, enhancement/denoising, datasets and augmentation, classification, transcription, source separation, and multimodal problems, and will solicit up to 4-page extended abstracts, plus a poster/demo session for live presentations. The program will feature invited talks from leading academic and industry researchers spanning speech, music, and general audio ML. Additionally, the workshop organizers will release several refreshed audio datasets alongside the workshop, for use in contributed work.

Workshop

Trustworthy AI for Good Workshop

Terry J. C. Zhang ⋅ Zhijing Jin ⋅ Milind Tambe ⋅ Rada Mihalcea ⋅ Changling Li ⋅ Joan Nwatu ⋅ David Lie
8:00 AM - 5:00 PM

Agentic systems increasingly shape how billions of people engage with public institutions, civic discourse, and society at large. While much effort has focused on making models safer by avoiding harmful output, it is more important for these improvements in model development to translate into social good at scale. This workshop aims to bring together the AI safety community, which often focuses on what a model can do as an individual system, with the AI for societal good and policy/governance communities, which focus on what models do when deployed across populations. We aim to bring together researchers, practitioners, policymakers, and opinion leaders to connect these perspectives so that safer models also help strengthen society at large.

Workshop

Foundation-model agents increasingly run in a closed loop with tools, memory, and multi-step action. This long-horizon interaction exposes failures that single-turn evaluation often misses: error cascades over trajectories, brittle tool use under interface shifts, unstable memory binding/read-write over time, weak recovery (diagnosis/backtracking/repair), and optimization-driven policy contraction (templated behavior, diversity/reasoning collapse). Failure Modes in Agentic AI (FMAI) proposes a focused platform that treats these failures as actionable research objects, with four deliverables: (1) operational definitions with explicit boundaries and loop localization; (2) minimal, reproducible triggers; (3) comparable protocols with trace-level diagnostics beyond terminal success; and (4) verifiable mitigation and repair strategies (including strong negative results). FMAI aligns ICML’s strengths in optimization, generalization, and evaluation with realistic agent loops to standardize how we diagnose and fix agentic failures.

Workshop

Culture x AI: Evaluating AI as a Cultural Technology

Cody Kommers ⋅ Drew Hemment ⋅ Meredith Martin ⋅ Canfer Akbulut ⋅ Matthew Wilkens ⋅ Adam Sobey
8:00 AM - 5:00 PM

Generative AI is increasingly recognised as a social and cultural technology. These systems process an enormous amount of social data to produce novel cultural artefacts, such as text, images, and videos. While much progress has been made in evaluating cultural aspects of AI, it has tended to focus on harm mitigation: identifying and preventing moral violations, the spread of bias and misinformation, and deviation from human values. But a more positive or constructive notion of culture in AI remains underdeveloped. How can we evaluate cultural aspects of AI technology in a way that not only seeks to avoid failure, but gives a more robust definition of success?

This workshop covers current approaches for evaluating cultural aspects of generative AI. Our primary focus is on work that aims to bring ideas and techniques from the humanities, arts, and qualitative social sciences upstream in AI development. We'll bring together a range of work at the intersection of culture and AI, with the goal not just of studying the effects of AI after deployment but also of actively shaping the design of the technology itself. The workshop will give special focus to research that seeks to articulate a positive vision for cultural AI.

Workshop

RLxF: RL from World Feedback

Shao-Hua Sun ⋅ Xingyou Song ⋅ Yash Akhauri
8:00 AM - 5:00 PM

This workshop explores a shift beyond human preference signals by treating world feedback 🌍 —measurable signals from real-world interactions such as efficiency, safety, health, performance, and economic outcomes—as a first-class training signal for reinforcement learning systems. The goal of this workshop is to move beyond human feedback to train reinforcement learning systems using world-grounded learning signals (e.g., efficiency, safety, and economic outcomes) that better reflect the true consequences of agent behavior. Bringing together researchers across reinforcement learning, foundation models, robotics, systems, and AI alignment, it focuses on modeling and integrating heterogeneous, noisy, and delayed feedback into modern learning pipelines. Through invited talks, contributed papers, and interactive panels, the workshop aims to clarify core challenges, develop shared frameworks, and advance scalable, robust, and deployable learning paradigms grounded in real-world consequences.
