Workshops
Programmatic Representations for Agent Learning
This workshop explores the use of programmatic representations—such as symbolic programs, code-based policies, and rule-based abstractions—to make agent learning frameworks more interpretable, generalizable, efficient, and scalable. Programs can explicitly encode policies, reward functions, task structures, and environment dynamics, providing human-understandable reasoning while reducing reliance on massive data-driven models. Programmatic representations also enable modularity and compositionality, allowing agents to efficiently reuse knowledge across tasks and adapt with minimal retraining. By bringing together the sequential decision-making community—including researchers in reinforcement learning, imitation learning, planning, search, and optimal control—with experts in program synthesis and code generation, this workshop aims to tackle the fundamental challenges of agent learning at scale and drive progress toward interpretable, generalizable, verifiable, robust, and safe autonomous systems across domains ranging from virtual agents to robotics.
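As a minimal illustration of the kind of representation discussed above, a policy can be written as an explicit, inspectable program rather than a learned black box. This is only a toy sketch; the state fields, thresholds, and action names are hypothetical:

```python
# Toy code-based policy: an explicit, human-readable program mapping
# states to actions. All field names and thresholds are illustrative.

def thermostat_policy(state):
    """Rule-based policy over a simple state dict with keys
    'temperature' and 'setpoint'; returns a discrete action."""
    if state["temperature"] < state["setpoint"] - 1.0:
        return "heat"
    if state["temperature"] > state["setpoint"] + 1.0:
        return "cool"
    return "idle"

print(thermostat_policy({"temperature": 17.0, "setpoint": 20.0}))  # heat
```

Because the policy is a program, its behavior can be inspected, verified, and edited directly, which is precisely the interpretability and modularity benefit the workshop highlights.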
The 1st Workshop on Vector Databases
Vector databases (Vector DBs) are a foundational and critical application layer for injecting information into large language models (LLMs). Although different companies have proposed various vector databases, no academic workshop has previously existed to discuss these systems comprehensively. This workshop aims to foster discussions on vector databases from various perspectives, ranging from mathematical theories to implementation-level optimizations. Topics covered in the workshop include retrieval-augmented generation (RAG), algorithms and data structures for approximate nearest neighbor search (ANN), data management systems for handling vector data, query languages, and embedding models. Furthermore, the workshop will also function as a platform for companies and researchers working on vector databases to present technical details (white papers) and exchange ideas.
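The core operation behind the ANN topic mentioned above can be sketched with an exact baseline: a linear scan over embedding vectors ranked by cosine similarity. ANN indexes (e.g., graph- or clustering-based structures) approximate this result far faster at scale. The document IDs and vectors below are made up for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, index, k=2):
    """Exact nearest-neighbor search by linear scan. ANN data structures
    trade a little recall for large speedups over this brute force."""
    scored = sorted(index.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Tiny illustrative "vector database": doc_id -> embedding.
index = {"doc1": [1.0, 0.0], "doc2": [0.9, 0.1], "doc3": [0.0, 1.0]}
print(top_k([1.0, 0.05], index, k=2))  # ['doc1', 'doc2']
```

In a RAG pipeline, the retrieved IDs would map back to text chunks that are injected into the LLM prompt.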
Tiny Titans: The next wave of On-Device Learning for Foundation Models (TTODLer-FM)
The rapid evolution of Deep Learning, propelled by transformer-based architectures and significant hardware advancements, has unlocked unprecedented capabilities across diverse domains, from biological sciences to autonomous systems. As foundation models continue to scale, they introduce new challenges in resource management, particularly in data centers, and in data availability, prompting us to broaden our exploration of leveraging distributed and on-device resources for training and inference. Small Language Models (SLMs) are emerging as a compelling alternative for generative AI, particularly at the edge, offering a sustainable balance between efficiency and user privacy. This workshop aims to bring together algorithms and systems experts to discuss the opportunities and challenges of on-device machine learning. We hope to explore to what extent SLMs can compete with or complement LLMs and identify methods to enhance their quality and efficiency. Addressing this shift requires innovation in algorithm and system co-design, underscoring the importance of interdisciplinary approaches for future applications.
2nd AI for Math Workshop @ ICML 2025
Mathematical reasoning stands as a pinnacle of human intelligence. The rapid advancements in artificial intelligence, particularly in large language models (LLMs), have opened new frontiers at the intersection of AI and mathematical reasoning. This workshop aims to explore the potential of AI in comprehending and advancing mathematical reasoning, with a focus on fostering collaboration between humans and machines to push the boundaries of mathematical discovery. The central theme revolves around the question: "How can we leverage and advance the mathematical reasoning abilities of machine learning models, and drive innovation across scientific and practical domains?" Our workshop will bring together researchers from diverse backgrounds, institutions, and disciplines to discuss the progress and future of AI technologies in mathematics. Specifically, we will delve into the following areas:
* Automated Theorem Proving: How can we build consistent theorem-proving systems? How can theorem-proving systems assist humans through human-computer interaction?
* Automated Theorem Generation: Can neural models generate new and practically meaningful theorems that have not yet been discovered? How can we utilize these newly generated theorems?
* Autoformalization and Verification: How can we improve the precision of translating natural language proofs into formal proofs, and vice versa?
* Problem Solving: How can we develop AI models to solve complex mathematical computational problems across various domains? How can AI models improve themselves during the learning process?
* Applications of AI in Mathematics: What are the practical applications of AI-driven mathematical reasoning in fields such as science, engineering, finance, and education?
The intended outcome is to identify new ideas, open problems, and interdisciplinary areas for future research related to mathematical reasoning.
To this end, we welcome papers on areas related, but not limited, to:
* Algorithms: How can we develop effective algorithms (e.g., reinforcement learning, self-improvement/evolution) to improve reasoning ability? What are the key principles for developing algorithms that minimize resource consumption (e.g., time, memory) while maintaining or improving reasoning performance?
* Data Generation: Can AI models generate questions that they cannot answer correctly? Can AI models achieve self-improvement through self-generated data?
* Tool Utilization: How can AI systems leverage existing tools, such as code and software, to solve practical mathematical problems more effectively?
* Limitation Analysis: What are the drawbacks or limitations of current models in mathematical reasoning (e.g., robustness, generalization, and reasoning boundaries)? How can these limitations be quantitatively analyzed?
2nd Generative AI for Biology Workshop
The 2024 Nobel Prize in Chemistry was awarded for AI-based protein structure prediction and protein design, highlighting the immense potential of AI in basic science and health research. In the meantime, generative AI models such as large language models (LLMs) and diffusion models are acquiring impressive capabilities in generating language, creating artwork, solving complex reasoning problems, writing computer programs, and more. To further facilitate the dialog between machine learning and biology, we propose to organize a workshop at ICML 2025, focusing on generative AI for biological discovery and therapeutic design. By fostering connections among preeminent researchers from both industry and academia, we aim to gain critical insights into the future of generative-AI-driven biology. Moreover, we hope to bridge the gap between machine learning and biological disciplines by focusing on three central themes that span both cutting-edge research and translational impact.
Assessing World Models: Methods and Metrics for Evaluating Understanding
Generative models across domains are capable of producing outputs that appear to mimic the real world. But have these systems actually understood the laws that govern the world? Researchers across subfields are attempting to answer this question: in natural language processing, researchers measure whether LLMs understand real-world mechanisms in order to measure how robust they are to new tasks; in video generation, researchers assess whether a model has understood the laws of physics in order to evaluate how realistic its videos are; in scientific domains, foundation models are being developed in order to uncover new theories about the world. Despite studying similar questions, these communities remain disparate. This workshop will explore the question: how can we formalize and evaluate whether generative models have understood the real world? While this question is important across communities, we don't have unified frameworks for defining and evaluating world models. This workshop will bring together these computer science communities along with non-computer-science scientists working on relevant applications. Our invited speakers include Jacob Andreas, Shiry Ginosar, Shirley Ho, Sendhil Mullainathan, and Martin Wattenberg, all of whom have confirmed that they will speak in person.
ICML 2025 Workshop on Computational Optimization of Buildings (CO-BUILD)
Tokenization Workshop (TokShop)
Tokenization defines how data are represented as input and output for many current machine learning systems, including language models. Tokenization has been shown to significantly affect the utility and effectiveness of these models (Mielke et al., 2021). This finding has stirred considerable interest in tokenization as a research direction in machine learning and its subfields, such as natural language processing, but currently, there is no venue specifically dedicated to it. Our initiative—TokShop (Tokenization Workshop)—aims to fill this gap and will focus on tokenization in a broad sense.
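To make concrete what "tokenization" means here, a toy sketch of the byte-pair-encoding (BPE) idea underlying many current tokenizers follows. The text, merge pairs, and helper names are illustrative, not any particular library's API:

```python
# Toy BPE-style tokenization: start from characters, then repeatedly
# merge a chosen adjacent pair into a single token.

def char_tokenize(text):
    """Base tokenization: one token per character."""
    return list(text)

def bpe_step(tokens, pair, merged):
    """Apply one BPE merge: replace each adjacent occurrence of `pair`
    with the single token `merged`."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(merged)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = char_tokenize("low lower")
tokens = bpe_step(tokens, ("l", "o"), "lo")
tokens = bpe_step(tokens, ("lo", "w"), "low")
print(tokens)  # ['low', ' ', 'low', 'e', 'r']
```

The choice and order of merges determines the vocabulary a model sees, which is exactly the design space the workshop targets.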
3rd Workshop on High-dimensional Learning Dynamics (HiLD)
Modern machine learning applications face the challenge of extracting insights from high-dimensional datasets. The 3rd High-dimensional Learning Dynamics (HiLD) Workshop focuses on predicting and analyzing the behavior of learning algorithms in regimes where both the number of samples and parameters are large. This workshop aims to advance research and foster collaboration in several key areas:
1. Developing tractable models and dynamical frameworks to explain phenomena observed in deep neural networks (DNNs) and foundation models;
2. Establishing mathematical frameworks for neural scaling laws as network width and depth approach infinity;
3. Identifying and characterizing relevant observable quantities in high-dimensional limits;
4. Understanding the provable effects of optimization algorithms, hyperparameters, and neural architectures on training and test dynamics.
The HiLD Workshop will unite experts from random matrix theory, optimization, high-dimensional statistics/probability, and statistical physics to share diverse perspectives on these challenges. By bringing together theorists and practitioners from machine learning with researchers from these adjacent fields, we aim to create new collaborations between communities that often do not interact. Through talks, poster sessions, and panel discussions, the workshop will explore the fundamental dynamics of learning algorithms in high-dimensional settings. This year's workshop theme is "Navigating Complexity: Feature Learning Dynamics at Scale."
2nd Workshop on Models of Human Feedback for AI Alignment (MoFA)
Our workshop brings together experts in machine learning, cognitive science, behavioral psychology, and economics to explore human-AI alignment by examining human (and AI) feedback mechanisms, their mathematical models, and practical implications. By fostering collaboration between technical and behavioral science communities, we aim to develop more realistic models of human feedback that can better inform the development of aligned AI systems.
1st Workshop on Foundation Models for Structured Data (FMSD)
Structured data foundation models are an emerging area of research undergoing rapid growth, yet they remain critically under-explored relative to image and text modalities. So far, the different structured data sub-communities have had little opportunity to come together and share insights about how to build foundation models for structured data. Yet strong synergies exist across modalities, since models share similar pre-training and in-context learning paradigms. Furthermore, models trained on one modality can also demonstrate promising predictive performance in another. This workshop brings together the tabular and time series communities to jointly discuss foundation models for structured data, enabling the communities to capitalize on their synergies. We aim for advancements in foundation models that unify structured data modalities, addressing challenges of scalability and generalization across real-world applications. This emerging field promises to transform how we approach structured data analysis and drive new opportunities across various domains.
Multi-Agent Systems in the Era of Foundation Models: Opportunities, Challenges and Futures
The scaling of model parameters has unlocked the groundbreaking capabilities of foundation models. Likewise, in human society, scaling and collaboration across individuals, organizations, companies, and nations amplify collective intelligence to unprecedented levels, enabling remarkable achievements that would be impossible for individuals alone, such as space exploration. Could this principle of scaling (Kaplan et al., 2020) also apply to the growth in the number of agents? Multi-agent systems may offer a promising path forward. By progressively integrating more agents, multi-agent systems can activate diverse functionalities within these foundation model-powered generalist agents and coordinate a broader range of complementary functionalities. This synergy fosters improved problem-solving, adaptability, and decision-making capabilities. As the multi-agent system scales, it has huge potential to achieve enhanced capabilities and tackle increasingly complex tasks, offering a promising solution toward the ultimate goal of achieving artificial general intelligence (AGI).
Machine Unlearning for Generative AI
Generative AI models are trained on internet-scale datasets, yielding powerful capabilities but also introducing risks like copyright infringement, PII leakage, and harmful knowledge. Targeted removal or unlearning of sensitive data is challenging, as retraining on curated sets is computationally expensive, driving research into machine unlearning and model editing. Yet approaches like RLHF only suppress undesirable outputs, leaving underlying knowledge vulnerable to adversarial extraction. This raises urgent privacy, security, and legal concerns, especially under the EU’s GDPR “right to be forgotten”. Because neural networks encode information across millions of parameters, precise deletion without degrading performance is complex, and adversarial or whitebox attacks can recover ostensibly erased data. This workshop brings together experts in AI safety, privacy, and policy to advance robust, verifiable unlearning methods, standardized evaluation frameworks, and theoretical foundations. By achieving true erasure, we aim to ensure AI can ethically and legally forget sensitive data while preserving broader utility.
CODEML: Championing Open-source DEvelopment in Machine Learning
Open-source software (OSS) development is a cornerstone of modern machine learning research. However, issues such as the sustainability of long-term projects, software reliability, and proper academic acknowledgment of maintenance and contributions are often overlooked. This workshop aims to identify and discuss strategies for successful and sustainable open-source development in ML while also proposing solutions to these challenges. Additionally, the workshop will provide a platform to recognize the efforts of open-source contributors in the field. We will bring together machine learning researchers, engineers, industrial practitioners, and software development experts. The workshop will feature invited talks, panel discussions with experts, and workshop paper submissions from open-source contributors in machine learning.
2nd Workshop on Test-Time Adaptation: Putting Updates to the Test (PUT)
Deep learning has advanced by scaling datasets, models, and training computation. At the same time, applications have broadened to many kinds of data (personal, scientific, …) and deployments (in clouds, on cars, …). Will these all be solved by more data, parameters, and training? Test-time updates are complementary, and can help on both foundation model servers and edge devices. This workshop examines train-time vs. test-time updates across scales through test-time adaptation, continual learning, in-context learning, and post-training model editing. The test begins now!
Scaling Up Intervention Models
Machine learning and AI have long been concerned with modeling how an agent can change the world around it. However, intervening in the physical world takes effort, leading to sparsity of evidence and corresponding gaps of credibility when an agent considers carrying out previously unseen actions. Making the most of sparse data within a combinatorial explosion of possible actions, dose levels, and waiting times requires careful thinking, akin to efforts for introducing more compositionality principles into machine learning (Andreas, 2019). The goal of this workshop is to bring together state-of-the-art ideas on how to predict the effects of novel interventions and distribution shifts by exploiting original ways of composing evidence from multiple data-generation regimes.
Machine Learning for Wireless Communication and Networks (ML4Wireless)
As wireless communication systems evolve to meet the demands of a hyper-connected world, artificial intelligence models are emerging as the driving force behind a new wave of technological innovation. This workshop will explore how state-of-the-art artificial intelligence and machine learning (ML) methods are poised to redefine the core of wireless networks providing solutions to old and new communication challenges. One of the central themes is semantic communication, where ML enables wireless networks to understand and transmit the meaning behind data, rather than the whole bitstream, drastically improving efficiency in bandwidth-constrained environments and presenting novel scenarios and possible applications that were not even conceivable a couple of years ago. Additionally, the rise of generative and language models for wireless communication is bringing new ways to compress and enhance signal transmissions, impacting several downstream applications such as autonomous driving, video streaming, and virtual reality. Concurrently with widening the range of applications, these models also bring novel challenges related to large models' computational demands or to the regenerated content's controllability and reliability. Central to bridging ML and wireless communication is the study of inverse problems, where generative models play a pivotal role in reconstructing lost or incomplete signals, and solving ill-posed tasks inherent in communication systems constrained by noisy and interference channels with limited bandwidth. The workshop aims also to explore key areas such as multimodal content compression, post-training quantization, efficient semantic feature extraction, and designing trustworthy models tailored for resource-constrained and noisy environments, in which foundational ML research finds crucial applications in communication scenarios.
Workshop Goals: This workshop aims to foster collaboration between ML researchers and wireless communication experts, encouraging cross-disciplinary innovation that will help shape the future of intelligent communication systems as well as more efficient and reliable AI models and techniques. Through a series of presentations, discussions, and interactive sessions, participants will explore both the theoretical foundations and practical applications of ML in wireless networks, with an eye toward addressing the most pressing challenges in this rapidly evolving field. On top of fostering collaborations and networking, we aim to boost research in machine learning and wireless communication topics by i) hosting discussions about the limitations of current wireless communication methods and how diverse ML models can empower communication systems by solving those challenges; ii) encouraging cross-collaboration between ML researchers and communication researchers; and iii) giving space to younger researchers and PhD students to present their work and to get in contact with experts in this area, which is usually arduous in main conference tracks.
Why This Workshop at ICML? We know that artificial intelligence and machine learning models are driving technological transformations across numerous applications, with a particularly significant impact on wireless communication, given our daily reliance on smartphones and the emergence of connected intelligent devices, ranging from autonomous cars to mobile humanoids. Nonetheless, few ML researchers are actively contributing to wireless communication communities and venues, leaving researchers from this field alone in developing AI-powered methods and systems. On the other hand, ML researchers working on topics potentially relevant to communication, such as compression, quantization, inverse problems, or reliability, sometimes lack real-world scenarios, datasets, or embedded systems to test their foundational research. We believe there is an unmet need to bridge the gap between these two research worlds. With this workshop, we aim to close this gap by fostering an active exchange and discussion between ML and communication researchers that can benefit both research communities and establish a starting point for future collaborations and connections between the two worlds.
Actionable Interpretability
Interpretability research has advanced considerably in uncovering the inner mechanisms of artificial intelligence (AI) systems and has become a crucial subfield within AI. However, translating interpretability findings into actionable improvements in model design, training, and deployment remains a challenge. As a result, such insights have rarely influenced real-world AI development. This workshop addresses a key yet underexplored question: How can interpretability research drive tangible advancements in AI systems? By fostering discussions on the practical applications of interpretability, we aim to bridge this gap and highlight work that moves beyond analysis to achieve concrete improvements in model alignment, robustness, and domain-specific performance. Through this workshop, we strive to refocus interpretability research on actionable impact rather than just analysis, ensuring its insights lead to meaningful advancements in AI.
The Impact of Memorization on Trustworthy Foundation Models
Foundation models have come to underpin many critical applications, such as healthcare, public safety, and education. Ensuring their trustworthiness is, therefore, more important than ever. However, recent research has revealed that foundation models are prone to memorizing details or even entire samples from their training data. This issue can lead to privacy violations, intellectual property infringement, and societal harm when sensitive information is leaked. While unintended memorization risks the integrity of models, a certain degree of it is essential for solving novel and complex tasks, highlighting the importance of balancing performance with data leakage. Currently, isolated solutions are being developed across various research fields and data modalities, often without integration or coordination. This fragmentation can lead to duplicated efforts despite shared goals. The lack of interaction and exchange between research fields hinders progress in understanding and mitigating undesired memorization. In this workshop, we explore the causes and consequences of memorization from both theoretical and practical perspectives. We aim to connect insights from different research fields, including data privacy, ethics, and security in machine learning, to assess their impact on models and society and to explore innovative methods for mitigating associated risks. By bringing together researchers and practitioners from diverse fields, we seek to bridge the gap between research and real-world applications, fostering the development of trustworthy foundation models that benefit society without compromising sensitive data, intellectual property, or individual privacy.
Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences
Recent advances in foundation models and large language models (LLMs) have revolutionized life sciences by enabling AI-driven insights into complex biological systems. However, most existing models focus on single-modal data, limiting their ability to capture the inherently multi-modal nature of biological processes. This workshop will explore the development and application of multi-modal foundation models and LLMs that integrate diverse biological data types, such as protein sequences, structures, genomic and transcriptomic data, and metabolomics. By bringing together researchers from machine learning, computational biology, and biomedical sciences, the workshop will address challenges in modality fusion, cross-modal representation learning, scalable pretraining, and interpretability. Discussions will focus on novel architectures, self-supervised learning methods, and real-world applications in drug discovery, precision medicine, and multi-omics data analysis. Through invited talks, poster sessions, contributed presentations, and panel discussions, this workshop aims to advance multi-modal foundation models and LLMs for biological discovery and foster interdisciplinary collaborations that push the boundaries of machine learning in life sciences.
ES-FoMo III: 3rd Workshop on Efficient Systems for Foundation Models
As models increase in size and training budget, they not only systematically improve in upstream quality, but also exhibit novel emergent capabilities, unlocking new AI applications. These new capabilities have led to a paradigm shift: large foundation models have become predominant in natural language processing and are growing increasingly common in computer vision, audio processing and even robotics. This increase in scale raises proportionate difficulties for practitioners: foundation model training and inference lie at a unique interdisciplinary crossroad, combining open problems in algorithms, system design, and software engineering.
In response to these challenges, diverse research directions have spawned promising works: (1) training and inference either at large scale or in resource-constrained scenarios (e.g., with higher network latency and lower bandwidth, in a collaborative manner across a fleet of contributed devices, or with a single GPU); (2) large-scale distributed training approaches, such as 3D parallelism and sharding; and (3) deep system optimizations, with custom languages such as TVM and Triton. These novel interdisciplinary research directions directly shape and impact the trajectory of research across machine learning.
Accordingly, these emerging lines of research are increasingly relevant to machine learning researchers. Indeed, researchers are key stakeholders: on the one hand, researchers may contribute algorithmic insights and novel methods to improving training and inference of large models (e.g., recent award-winning papers at ICML and NeurIPS); on the other hand, novel research findings may be best demonstrated at scale --- which may require training models as efficiently as possible to make the best use of available resources.
The goal of this workshop is to bring together interdisciplinary experts working on the emerging research questions and challenges associated with foundation model training and inference. This would be the third installment of the ES-FoMo workshop at ICML. This year, we are bringing further focus on two trends observed in 2024 and early 2025: (1) test-time compute, popularized by OpenAI o1 and DeepSeek r1, and (2) the emergence of new modeling paradigms and modalities such as real-time video and decentralized training. We look forward to continuing to grow this community at ICML 2025.
Exploration in AI Today (EXAIT)
How can we efficiently collect observations for optimization, control, and generalization? This is a key challenge in AI and is known as the exploration problem. Effective exploration has driven progress in areas such as robotics, recommender systems, and clinical trials. However, as we address larger, more complex applications—such as drug discovery or language modeling—the exceptionally large search spaces render traditional exploration algorithms ineffective. As a result, recent breakthroughs in AI have come not from traditional exploration algorithms, but largely from training large foundation models on diverse corpora of pre-existing, curated datasets. Despite this, we have witnessed sparks showing that exploration, when done right, can compensate for data and computation—for example, in the training of DeepSeek-R1—suggesting that exploration can still play a key role in AI today.
The Exploration in AI Today (EXAIT) Workshop at ICML 2025 will focus on addressing the evolving role of exploration in AI. We will dwell on the question: what is the place of exploration in today’s AI landscape and in which settings can exploration algorithms address current open challenges? In particular, we consider the potentially pivotal role that exploration might play in navigating complex and high-dimensional search spaces across real-world applications such as robotics, large language model alignment, and AI for science.
Workshop on Computer Use Agents
Computer use models are attracting significant interest in academia and industry due to their ability to perform complex tasks in non-deterministic environments. However, they are far from being ready for unattended deployment, as evidenced by their performance on the OSWorld benchmark, where they achieve only a small fraction of human performance. The rapid evolution of these agents raises important questions regarding their accuracy, safe deployment, and potential impact on the future of work. The topics we would like to cover are:
- Learning Algorithms: which new architectures and learning techniques (e.g., memory mechanisms for extended tasks, exploration strategies) can enhance the intrinsic ability of computer use agents to acquire, represent, and refine knowledge?
- Orchestration: what novel frameworks or control methods (e.g., dynamic task planning, modular coordination, multi-agent systems) can efficiently manage and integrate multiple learning components to optimize overall agent performance?
- Interfaces: how should agents perceive and act within their environments (e.g., via APIs or UI interactions), and should we design unified systems or specialized agents for different modalities?
- Guardrails, safety & societal implications: what guardrails do we need in order to make computer use models safe for deployment "in the wild" while ensuring that they have a positive impact on society?
- Benchmarking & tools: how can we develop robust environments and evaluation metrics that capture the diversity of real-world settings? Do we need new tools or frameworks to make research on computer use more efficient and accessible?
- Human-agent interaction: how will future interactions evolve? Should we optimize agents for full autonomy or design them as personalized, human-centric collaborators?
- Broader applications: what are some practical applications for computer use agents across domains such as healthcare, scientific research, and software engineering and testing?
- Capability horizon: what breakthroughs or engineering challenges are required to enable agents orders of magnitude more capable than today, and what implications would such advances have?
ICML 2025 Workshop on Collaborative and Federated Agentic Workflows (CFAgentic @ ICML'25)
This workshop aims to provide a platform for discussing the convergence of collaborative and federated learning with agentic workflows, an emerging class of AI systems capable of autonomously executing complex task sequences. We aim to facilitate an engaging discussion among scholars and practitioners by soliciting work addressing key challenges in precision, efficiency, and personalization; safety and security; and regulatory compliance in the development of collaborative and federated agentic workflows.
Methods and Opportunities at Small Scale (MOSS)
The increasing computational demands of modern ML create a critical challenge: thorough experimentation becomes prohibitively expensive precisely when we most need to understand and steer model behavior. Small-scale experiments (<= 1 GPU) offer a powerful approach for systematic investigation, enabling both scientific understanding and practical advances. Recent work demonstrates the endless opportunities at this scale, including: diagnoses and mitigations of training pathologies; minimalistic replications of modern pipelines; elementary synthetic tasks that “stress test” architectures and motivate new designs; and discovery of intriguing phenomena. This workshop aims to highlight how methods and opportunities at small scale can unlock new insights and drive progress. The emphasis will be on advancing scientific understanding (and, optionally, its interplay with theory), without the need to improve state-of-the-art performance.
Building Physically Plausible World Models
The goal of this workshop is to exchange ideas and establish communications among researchers working on building generalizable world models that describe how the physical world evolves in response to interacting agents (e.g., humans and robots). Large-scale datasets of videos, images, and text hold the key for learning generalizable world models that are visually plausible. However, distilling useful physical information from such diverse unstructured data is challenging and requires careful attention to data curation, developing scalable algorithms, and implementing suitable training curricula. On the other hand, physics-based priors can enable learning plausible scene dynamics, but they are difficult to scale to complex phenomena that lack efficient solvers or even governing dynamic equations. Developing general world models that can simulate complex real-world phenomena in a physically plausible fashion would unlock enormous opportunities in generative modeling and robotics and would be of wide interest to the larger AI community; we believe this workshop comes at an ideal time given recent significant progress in both video modeling and physics-based simulation. This workshop aims to bring together researchers in machine learning, robotics, physics-based simulation, and computer vision broadly aspiring to build scalable world models by utilizing internet data, simulation, and beyond in myriad ways.
The 2nd Workshop on Reliable and Responsible Foundation Models
Foundation models (FMs), with their emergent and reasoning abilities, are reshaping the future of scientific research and broader human society. However, as their intelligence approaches or surpasses that of humans, concerns arise regarding their responsible use in real-world applications, including reliability, safety, transparency, and ethics. The workshop on reliable and responsible FMs delves into the urgent need to ensure that such models align with human values. The significance of this topic cannot be overstated, as the real-world implications of these models impact everything from daily information access to critical decision-making in fields like medicine and finance, especially for embodied FMs that directly interact with the physical world. Stakeholders, including developers, practitioners, and policymakers, care deeply about this because the reliable and responsible design, deployment, and oversight of these models dictate not only the success of AI solutions but also the preservation of societal norms, order, equity, and fairness. Some of the fundamental questions that this workshop aims to address are:
* Diagnosis: How can we identify and characterize unreliable and irresponsible behaviors in FMs? Topics include prompt sensitivity, lack of self-consistency, and hallucinations in generation.
* Evaluation: How should we assess the harmful capabilities of FMs and quantify their societal impact?
* Sources: How can we pinpoint and understand the known or emerging sources of FM unreliability? This involves examining training data, optimization objectives, and architectural design.
* Generalization: How can responsible and reliable properties be effectively adapted to increasingly advanced FMs, particularly as they incorporate new features such as more modalities or long chain-of-thought (CoT) reasoning?
* Governance: What principles or guidelines should inform the next generation of FMs to ensure they are reliable and responsible? How can real-time monitoring of these FMs be enabled?
* Guarantee: Can we establish theoretical frameworks for provably reliable and responsible FMs?
* Practice: How can we leverage domain-specific knowledge to guide FMs toward improved reliability and responsibility across diverse areas, such as drug discovery, education, or clinical health?
DIG-BUGS: Data in Generative Models (The Bad, the Ugly, and the Greats)
Generative models have become extremely powerful and are now integral to various aspects of daily life, from creative arts to customer service. Given their increasing interaction with people, ensuring their trustworthiness is crucial. This workshop centers on the idea that the safety and reliability of generative models are deeply connected to the nature and treatment of their training data. We aim to explore the hypothesis that building reliable and trustworthy artificial intelligence (AI) systems based on generative models must start with high-quality and responsibly managed data. The workshop will focus on several key areas where training data impacts the trustworthiness of generative models. Among others, we will address 1) privacy concerns, highlighting how improper inclusion and handling of sensitive information in the training data can lead to significant privacy violations; 2) safety risks, like backdoors and data poisoning that threaten robust generations; and 3) the impact of biases in generative models' training data, which can cause models to perpetuate or even amplify societal biases, resulting in unfair outcomes. Through expert talks, panel discussions, and interactive sessions, participants will delve into these issues and explore strategies for developing safe, trustworthy, and reliable generative models. This workshop aims to foster collaboration and drive forward research to ensure that generative models, as they become more embedded in our lives, do so in a trustworthy and beneficial manner.
DataWorld: Unifying data curation frameworks across domains
Recently, data-centric research, which has historically taken a backseat to model-centric research, has assumed a central role in the machine learning community. Our workshop aims to explore data-centric methods and theory, with a particular emphasis on real-world data curation. By curation, we mean the set of actions taken by some curator(s) to transition from ideation to a complete dataset. Our topic is wide-ranging, with recent work studying everything from sourcing to benchmarks. One area that remains relatively underexplored is how data-centric methods can perform differently depending on the modality and domain of the data and the downstream application. Which lessons can be shared across domains and modalities, and which cannot? For example, a common part of the data pipeline involves data filtration. Filtration, in domains like medical imaging and wildlife camera traps, faces similar challenges, including long-tailed distributions and natural distribution shifts (between hospitals and camera locations, respectively). However, the two domains differ in the types of distribution shift encountered (covariate vs. label vs. subpopulation) and dataset scale (there are generally more camera trap images than medical scans). Another example is the fact that most successful filtration methods in the recent DataComp benchmark tend to disproportionately remove images with non-English captions. Such methods not only degrade performance on non-English benchmarks but also fail to generalize to other domains and most real-world applications. Our workshop will invite novel research which seeks to unify seemingly disparate frameworks for data curation; where this is impossible, we hope that the necessary trade-offs and domain-specific challenges will be made clearer.
TerraBytes: Towards global datasets and models for Earth Observation
Earth Observation (EO) presents unique challenges for machine learning due to its non-stationary data distribution, spatio-temporal biases, and multimodal nature. TerraBytes aims to address these challenges by fostering discussions at the intersection of data curation, machine learning, and remote sensing. The workshop focuses on (1) curating less biased, globally representative EO datasets, (2) developing adaptable ML models for EO applications, and (3) bridging the gap between data acquisition and ML communities. By promoting interdisciplinary collaboration, TerraBytes seeks to advance EO research and enable inclusive, fair, and impactful applications.
AI Heard That! ICML 2025 Workshop on Machine Learning for Audio
The Machine Learning for Audio workshop at ICML 2025 will cover a broad range of tasks and challenges involving audio data. These include, but are not limited to: methods of speech modeling, environmental sound generation or other forms of ambient sound, novel generative models, music generation in the form of raw audio, text-to-speech methods, denoising of speech and music, data augmentation, classification of acoustic events, transcription, source separation, and multimodal problems.
Workshop on Technical AI Governance
As the development and use of AI systems expands, policymakers increasingly recognize the need for targeted actions that promote beneficial outcomes while mitigating potential harms. Yet there is often a gap between these policy goals and the technical knowledge required for effective implementation, risking ineffective or actively harmful results (Reuel et al., 2024b). Technical AI governance—a nascent field focused on providing analyses and tools to guide policy decisions and enhance policy implementation—currently lacks sufficient venues for exchanging scholarly work. This workshop aims to provide such a venue, fostering interdisciplinary dialogue between machine learning researchers and policy experts by ensuring each submission is reviewed by both technical and policy specialists. Through this collaboration, we seek to accelerate the development of robust governance strategies that lead to safer, more equitable AI systems.
The Second Workshop on Long-Context Foundation Models
Foundation models have become a cornerstone in the advancement of artificial intelligence, enabling applications across a wide range of domains. Many complex tasks today require processing and synthesizing information over thousands to millions of individual pieces of data, from text and images to audio and genomic sequences. Recent progress in long-context models has made it possible to handle such extensive inputs, but significant challenges remain, particularly in terms of computational efficiency, data quality and quantity, and evaluation. This workshop will convene researchers to explore these challenges and foster developments in long-context foundation models. Key topics include new modeling architectures, training approaches, efficiency techniques, and comprehensive evaluation methods. Additionally, in this edition, special attention will be given to long-context reasoning, multimodal learning, and applications in scientific fields such as genomics and climate science. By tackling these critical challenges, we aim to push the boundaries of long-context modeling and shape its future directions.