Hamish Flynn · Maxime Heuillet · Audrey Durand · Melih Kandemir · Benjamin Guedj

Interactive learning encompasses online learning, continual learning, active learning, bandits, reinforcement learning, and other settings where an algorithm must learn while interacting with a continual stream of data. Such problems often involve exploration-exploitation dilemmas, which can be elegantly handled with probabilistic and Bayesian methods. Deep interactive learning methods leveraging neural networks are typically used when the setting involves rich observations, such as images. As a result, both probabilistic and deep interactive learning methods are growing in popularity. However, acquiring observations in an interactive fashion with the environment can be costly. There is therefore great interest in understanding when sample-efficient learning with probabilistic and deep interactive learning can be expected or guaranteed. Within statistical learning theory, PAC-Bayesian theory is designed for the analysis of probabilistic learning methods. It has recently been shown to be well-suited for the analysis of deep learning methods. This workshop aims to bring together researchers from the broad Bayesian and interactive learning communities in order to foster the emergence of new ideas that could contribute to both theoretical and empirical advancement of PAC-Bayesian theory in interactive learning settings.

Weina Jin · Ramin Zabih · S. Kevin Zhou · Yuyin Zhou · Xiaoxiao Li · Yifan Peng · Zongwei Zhou · Yucheng Tang · Yuzhe Yang · Agni Kumar

Applying machine learning (ML) in healthcare is gaining momentum rapidly. However, the black-box characteristics of existing ML approaches inevitably lead to less interpretability and verifiability in making clinical predictions. To enhance the interpretability of medical intelligence, it becomes critical to develop methodologies to explain predictions as these systems are pervasively being introduced to the healthcare domain, which requires a higher level of safety and security. Such methodologies would make medical decisions more trustworthy and reliable for physicians, which could ultimately facilitate deployment. In addition, it is essential to develop more interpretable and transparent ML systems. For instance, by exploiting structured knowledge or prior clinical information, one can design models to learn aspects more aligned with clinical reasoning. This may also help mitigate biases in the learning process, or identify more relevant variables for making medical decisions. In this workshop, we aim to bring together researchers in ML, computer vision, healthcare, medicine, NLP, public health, computational biology, biomedical informatics, and clinical fields to facilitate discussions including related challenges, definition, formalisms, and evaluation protocols regarding interpretable medical machine intelligence. Our workshop will be in a large-attendance talk format. The expected number of attendees is about 150. The workshop appeals to ICML …

Dinghuai Zhang · Yuanqi Du · Chenlin Meng · Shawn Tan · Yingzhen Li · Max Welling · Yoshua Bengio

The workshop focuses on theory, methodology, and application of structured probabilistic inference and generative modeling, both of which are important topics in machine learning. Specifically, probabilistic inference addresses the problem of amortization, sampling, and integration of complex quantities from graphical models, while generative modeling captures the underlying probability distributions of a dataset. Apart from applications in computer vision, natural language processing, and speech recognition, probabilistic inference and generative modeling approaches have also been widely used in natural science domains, including physics, chemistry, molecular biology, and medicine. Despite the promising results, probabilistic methods face challenges when applied to highly structured data, which are ubiquitous in real-world settings, limiting the applications of such methods. This workshop aims to bring experts from diverse backgrounds and related domains together to discuss the applications and challenges of probabilistic methods. The workshop will emphasize challenges in encoding domain knowledge when learning representations and performing inference and generation. By bringing together experts from academia and industry, the workshop will provide a platform for researchers to share their latest results and ideas, fostering collaboration and discussion in the field of probabilistic methods.

Valentin De Bortoli · Maxim Raginsky · Animashree Anandkumar · Guan-Horng Liu · Pratik Chaudhari · Melanie Zeilinger · Tianrong Chen · Charlotte Bunne

Recent advances in algorithmic design and principled, theory-driven deep learning architectures have sparked a growing interest in control and dynamical system theory. In a complementary direction, machine learning plays an important role in enhancing existing control theory algorithms in terms of performance and scalability. The boundaries between both disciplines are blurring even further with the rise of modern reinforcement learning, a field at the crossroads of data-driven control theory and machine learning. This workshop aims to unravel the mutual relationship between learning, control, and dynamical systems and to shed light on recent parallel developments in different communities. Strengthening the connection between learning and control will open new possibilities for interdisciplinary research areas.

Zheng Xu · Peter Kairouz · Bo Li · Tian Li · John Nguyen · Jianyu Wang · Shiqiang Wang · Ayfer Ozgur

Proposed around 2016 as privacy-preserving techniques, federated learning and analytics (FL & FA) have made remarkable progress in theory and practice in recent years. However, there is a growing disconnect between theoretical research and practical applications of federated learning. This workshop aims to bring academics and practitioners closer together to exchange ideas: discuss actual systems and practical applications to inspire researchers to work on theoretical and practical research questions that lead to real-world impact; understand the current development and highlight future directions. To achieve this goal, we aim to have a set of keynote talks and panels featuring industry researchers focused on deploying federated learning and analytics in practice, and academic research leaders who are interested in bridging the gap between theory and practice.

Nezihe Merve Gürel · Bo Li · Theodoros Rekatsinas

Thinking fast and automatic vs. slow and deliberate (respectively System I and II) is a popular analogy when comparing data-driven learning to the good old-fashioned symbolic reasoning approaches. Underlying this analogy lie the different capabilities of both systems, or lack thereof. On the one hand, data-driven learning (System I) has striking performance advantages over symbolic reasoning (System II) but lacks abilities such as abstraction, comprehensibility and contextual awareness. On the other hand, symbolic reasoning tackles those issues but tends to lag behind data-driven learning when it comes to speedy, efficient and automated decision-making. To combat the issues on both sides, there is an increasing consensus among the machine learning and artificial intelligence communities to draw out the best of both worlds and unify data-driven approaches with rule-based, symbolic, logical and commonsense reasoning. This workshop aims to discuss emerging advances and challenges on this topic, in particular at the intersection of data-driven paradigms and knowledge and logical reasoning. We focus on both directions of this intersection: Knowledge and Logical Reasoning for Data-driven Learning: In this direction, we will investigate the role of rule-based, knowledge and logical reasoning to enable more deliberate and trustworthy data-driven learning. Data-driven Learning …

Yubin Xie · Cassandra Burdziak · Dana Pe'er · Debora Marks · Alexander Anderson · Elham Azizi · Abdoulaye Baniré Diallo · Wesley Tansey · Bianca Dumitrascu · Sandhya Prabhakaran · Maria Brbic · Mafalda Dias · Cameron Park · Pascal Notin · Joy Fan · Ruben Weizman · Lingting Shi · Siyu He · Yinuo Jin

Each year, machine learning (ML) advances are successfully translated to develop systems we now use regularly, such as speech recognition platforms or translation software. The COVID-19 pandemic has highlighted the urgency for translating these advances to the domain of biomedicine. Biological data has unique properties (high dimensionality, degree of noise and variability), and therefore poses new challenges and opportunities for methods development. To facilitate progress toward long-term therapeutic strategies or basic biological discovery, it is critical to bring together practitioners at the intersection of computation, ML, and biology. The ICML Workshop on Computational Biology (WCB) will highlight how ML approaches can be tailored to making both translational and basic scientific discoveries with biological data, such as genetic sequences, cellular features or protein structures and imaging datasets, among others. This workshop thus aims to bring together interdisciplinary ML researchers working in areas such as computational genomics; neuroscience; metabolomics; proteomics; bioinformatics; cheminformatics; pathology; radiology; evolutionary biology; population genomics; phenomics; ecology; cancer biology; causality; representation learning and disentanglement to present recent advances and open questions to the machine learning community. We especially encourage interdisciplinary submissions that might not neatly fit into one of these categories. See attached document for additional details.

Julia Schnabel · Andreas Maier · Pallavi Tiwari · Oliver Stegle

This new workshop will bring together interdisciplinary scientists and practitioners working at the intersection of machine learning (ML) with medicine, pathology and biology, to present new methods and solutions for healthcare challenges across the full range of multimodal, and often highly heterogeneous and complex, patient data to the wider ICML community. Topics of interest include, but are not limited to:
• Multimodal fusion and learning in medical imaging, digital pathology, computational biology, genetics, electronic healthcare records, …
• Multimodal biomarkers for early prediction of disease onset, therapeutic response or disease recurrence
• Benchmarking, domain shifts, and generalization of ML in multimodal healthcare data
• ML for dealing with inherent sparsity, incompleteness and complexity of multimodal healthcare data
• ML for ensuring fairness and reducing bias in healthcare applications
• ML for privacy preservation in healthcare data
• Co-creation and human-in-the-loop for ML in healthcare

David I. Inouye · Mengye Ren · Mateusz Malinowski · Michael Eickenberg · Gao Huang · Eugene Belilovsky

Despite being widely used, global end-to-end learning has several key limitations. It requires centralized computation, making it feasible only on a single device or a carefully synchronized cluster. This restricts its use on unreliable or resource-constrained devices, such as commodity hardware clusters or edge computing networks. As the model size increases, synchronized training across devices will become less efficient and will impact all types of parallelism. Even if computational parallelism is achieved, global learning also requires a large memory footprint, which is correlated with the monetary cost of training and limits the learning capability of single devices. Moreover, end-to-end learning updates have high latency because of the large round-trip time, which may prevent their use in real-time applications such as learning on streaming video. Finally, the global backpropagation approach is thought to be biologically implausible, as biological synapses update in a local and asynchronous manner. To overcome these limitations, this workshop will delve into the fundamentals of localized learning. Broadly defined as any training method that updates model parts through non-global objectives, localized learning has the potential to develop highly decentralized, parallel, asynchronous, and fault-tolerant algorithms that can learn on heterogeneous hardware devices under dynamic conditions while maintaining comparable …

Andi Peng · Akanksha Saran · Andreea Bobu · Tengyang Xie · Pierre-Yves Oudeyer · Anca Dragan · John Langford

Systems that can learn interactively from their end-users are quickly becoming widespread in real-world applications. Typically humans provide tagged rewards or scalar feedback for such interactive learning systems. However, humans offer a wealth of implicit information (such as multimodal cues in the form of natural language, speech, eye movements, facial expressions, gestures, etc.) which interactive learning algorithms can leverage during the process of human-machine interaction to create a grounding for human intent, and thereby better assist end-users. A closed-loop sequential decision-making domain offers unique challenges when learning from humans: (1) the data distribution may be influenced by the choices of the algorithm itself, and thus interactive ML algorithms need to adaptively learn from human feedback, (2) the nature of the environment itself changes rapidly, (3) humans may express their intent in various forms of feedback amenable to naturalistic real-world settings, going beyond tagged rewards or demonstrations. By organizing this workshop, we attempt to bring together interdisciplinary experts in interactive machine learning, reinforcement learning, human-computer interaction, cognitive science, and robotics to explore and foster discussions on such challenges. We envision that this exchange of ideas within and across disciplines can build new bridges, address some of the most valuable challenges …

Nina Corvelo Benz · Ricardo Dominguez Olmedo · Manuel Gomez-Rodriguez · Thorsten Joachims · Amir Karimi · Stratis Tsirtsis · Isabel Valera · Sarah Wu

Had I left 5 minutes earlier, I would have caught the bus. Had I been driving slower, I would have avoided the accident. Counterfactual thoughts—“what if?” scenarios about outcomes contradicting what actually happened—play a key role in everyday human reasoning and decision-making. In conjunction with rapid advancements in the mathematical study of causality, there has been an increasing interest in the development of machine learning methods that support elements of counterfactual reasoning, i.e., they make predictions about outcomes that "could have been different". Such methods find applications in a wide variety of domains ranging from personalized healthcare and explainability to AI safety and offline reinforcement learning. Although the research at the intersection of causal inference and machine learning is blooming, there has been no venue so far explicitly focusing on methods involving counterfactuals. In this workshop, we aim to fill that space by facilitating interdisciplinary interactions that will shed light onto the three following questions: (i) What insights can causal machine learning take from the latest advances in cognitive science? (ii) In what use cases is each causal modeling framework most appropriate for modeling counterfactuals? (iii) What barriers need to be lifted for the wider adoption of counterfactual-based machine learning …

Ce Zhang · Praveen Paritosh · Newsha Ardalani · Nezihe Merve Gürel

This is the third edition in a series of highly successful workshops focused on data-centric AI, following the Data-Centric AI workshop at NeurIPS 2021 and the DataPerf workshop at ICML 2022. Data, and operations over data (e.g., cleaning, debugging, curation), have been continually fueling the success of machine learning for decades. While historically the ML community has focused primarily on model development, the quality of data used for ML training and evaluation has recently attracted intensive interest from the community, as shown by the creation of the NeurIPS dataset and benchmark track, several data-centric AI benchmarks (e.g., DataPerf), and the flourishing of data consortiums such as LAION. The goal of this workshop is to facilitate discussion of these important topics in what we call Data-centric Machine Learning Research, which includes not only datasets and benchmarks, but tooling and governance, as well as fundamental research on topics such as data quality and data acquisition for dataset creation and optimization.

Yoav Wald · Claudia Shi · Aahlad Puli · Amir Feder · Limor Gultchin · Mark Goldstein · Maggie Makar · Victor Veitch · Uri Shalit

As machine learning models are introduced into every aspect of our lives, and potential benefits become abundant, so do possible catastrophic failures. One of the most common failure scenarios when deploying machine learning models in the wild, which could possibly lead to dire consequences in extreme cases, is the reliance of models on apparently unnatural or irrelevant features. The issue comes up in a variety of applications, from the reliance of detection models for X-rays on scanner types and marks made by technicians in the hospital to visual question answering models being sensitive to linguistic variations in the questions; the list of examples of such undesirable behaviors keeps growing. In examples like these, the undesirable behavior stems from the model exploiting a spurious correlation. For ICML 2022, we organized the first workshop on Spurious Correlations Invariance and Stability (SCIS), to spark a discussion in the community on topics regarding spurious correlations and methods for treating them. We received 77 high-quality submissions, and the workshop was one of the most well-attended in the conference, seeing around 300 attendees. A clear conclusion from this experience is that work on spurious correlations is a long-term effort that spans communities such as fairness, causality-inspired ML, and domains …

Yang Li · Ranjay Krishna · Helena Vasconcelos · Bryan Wang · Forrest Huang

Artificial intelligence (AI) and Human Computer Interaction (HCI) share common roots: early work on conversational agents has laid the foundation for both fields. However, economic and political influences have driven these fields to remain separate in subsequent decades. The recent rise of data-centric methods in machine learning has propelled few-shot emergent AI capabilities, resulting in a raft of practical tools. In particular, modern AI techniques now power new ways for machines and humans to interact. Recently, a wave of HCI tasks have been proposed to the machine learning community, which direct AI research by contributing new datasets and benchmarks, and challenging existing modeling techniques, learning methodologies, and evaluation protocols. This workshop offers a forum for researchers to discuss these new research directions, identifying important challenges, showcasing new computational and scientific ideas that can be applied, sharing datasets/tools that are already available, or proposing those that should be further developed.

Jennifer Hu · Alane Suhr · Saujas Vaduguru · Chenghao Yang · Pei Zhou · Xuhui Zhou · Hao Zhu

Theory of Mind (ToM) is the ability to reason about the minds of other agents. The main theme of our workshop is the computational modeling of ToM, with a special focus on the role of natural language in such modeling. Specifically, ToM 2023 pays attention to cognitive foundations and theories of ToM, the acquisition and relationship between language and ToM, leveraging ToM to improve and explain NLP and ML models, and using ToM for positive social impact. This workshop intends to promote the community of researchers that are interested in improving the ability of intelligent agents to reason about others' mental states. Our proposed program provides a space to discuss pathways for understanding and applying ToM in psycholinguistics, pragmatics, human value alignment, social good, model explainability, and many other areas of NLP. ToM 2023 will be a full-day hybrid in-person/virtual workshop with several keynote speeches, and oral/poster/spotlight presentations, followed by a breakout discussion, panel discussion, and best paper award announcement. We also intend to host a mentoring program to broaden participation from a diverse set of researchers.

Haoran Sun · Hanjun Dai · Priyank Jaini · Ruqi Zhang · Ellen Vitercik

There have recently been new research trends in efficient discrete sampling and optimization. We are organizing this workshop with the goals of 1) syncing up on the latest research progress in discrete sampling and optimization; 2) discussing the limitations of current methods and brainstorming new algorithmic paradigms; and 3) connecting to applications in domains like language/protein modeling, physics simulation, and bio/chemical engineering, where improved sampling/optimization in discrete space would help, and learning the current gap between application needs and the capability of existing methods. We hope this workshop will be an excellent opportunity for presenting and discussing new algorithms and applications with researchers and practitioners within or outside the domain of discrete sampling/optimization.

Thomas Möllenhoff · Zelda Mariet · Mathieu Blondel · Khan Emtiyaz

Duality is a pervasive and important principle in mathematics. Not only has it fascinated researchers in many different fields but it has also been used extensively in optimization, statistics, and machine learning (ML), giving rise to powerful tools such as Fenchel duality in convex optimization, representer theorems in kernel methods and Bayesian nonparametrics, and dually-flat spaces in information geometry. Such applications have played an important role in the past, but lately we do not see much work on duality principles, especially in deep learning. For example, Lagrange duality can be useful for model explanation because it allows us to measure sensitivity to certain perturbations, but this is not yet fully exploited. This slowdown is perhaps due to a growing focus on nonconvex and nonlinear problems where duality does not seem to be directly applicable. There have not been any workshops on duality in recent years. With this workshop, we aim to revive the interest of the ML community in duality principles. The goal of the workshop is to bring together researchers working on various duality concepts from many different fields, and discuss new applications for modern machine learning, especially focusing on topics such as model understanding, explanation, and adaptation in deep learning …

Francois Lanusse · Marc Huertas-Company · Brice Menard · Laurence Perreault-Levasseur · J. Xavier Prochaska · Uros Seljak · Francisco Villaescusa-Navarro · Ashley Villar

As modern astrophysical surveys deliver an unprecedented amount of data, from the imaging of hundreds of millions of distant galaxies to the mapping of cosmic radiation fields at ultra-high resolution, conventional data analysis methods are reaching their limits in both computational complexity and optimality. Deep Learning has rapidly been adopted by the astronomical community as a promising way of exploiting these forthcoming big-data datasets and of extracting the physical principles that underlie these complex observations. This has led to an unprecedented exponential growth of publications combining Machine Learning and astrophysics. Yet, many of these works remain at an exploratory level and have not been translated into real scientific breakthroughs. Following a successful initial iteration of this workshop at ICML 2022, our continued goal for this workshop series is to bring together Machine Learning researchers and domain experts in the field of Astrophysics to discuss the key open issues which hamper the use of Deep Learning for scientific discovery.

Aadirupa Saha · Mohammad Ghavamzadeh · Robert Busa-Fekete · Branislav Kveton · Viktor Bengs

Learning from human preferences, or preference-based learning, has been critical to major advancements in AI and machine learning. Since human beings are naturally more reliable at providing feedback on a relative scale compared to numerical values, collecting preference feedback is more budget-friendly and involves less bias. The broad objective of this workshop is twofold: 1) Bring together different communities where preference-based learning has played a major role. This includes dueling bandits, multi-agent games, econometrics, social choice theory, reinforcement learning, optimization, robotics and many more, for which we aim to create a suitable forum to exchange techniques and ideas, learn from each other, and potentially create new and innovative research questions. 2) Connect theory to practice by identifying real-world systems which can benefit from incorporating preference feedback, such as marketing, revenue management, search engine optimization, recommender systems, healthcare, language modeling, interactive chatbots, text summarization, robotics, and so on. We will consider our workshop a success if it inspires researchers to pursue novel insights in the general area of preference-based learning: bringing attention from different communities to foster dissemination, cross-fertilization and discussion at scale; especially, building bridges between experimental researchers and theorists towards developing better models and practical algorithms, and encouraging participants to propose, …

Julien Launay · Daniel Y Fu · Tri Dao · Daniel Hesslow · Beidi Chen · Azalia Mirhoseini · Percy Liang

As models increase in size and training budget, they not only systematically improve in upstream quality, but also exhibit novel emergent capabilities, unlocking new AI applications. These new capabilities have led to a paradigm shift: large foundation models have become predominant in natural language processing and are growing increasingly common in computer vision, audio processing and even robotics. This increase in scale raises proportionate difficulties for practitioners: foundation model training and inference lie at a unique interdisciplinary crossroad, combining open problems in algorithms, system design, and software engineering. The goal of this workshop is to bring together interdisciplinary experts working on the emerging research questions and challenges associated with foundation model training and inference. We welcome submissions around training and inference systems/algorithms for foundation models, focusing on scaling-up or on reducing compute, time, memory, bandwidth, and energy requirements. Notably, we encourage submissions concerning the entire spectrum of foundation models: from BERT-sized Transformers, to large models with 100B+ parameters.

Mark Müller · Brendon G. Anderson · Leslie Rice · Zhouxing Shi · Shubham Ugare · Huan Zhang · Martin Vechev · Zico Kolter · Somayeh Sojoudi · Cho-Jui Hsieh

As machine learning-based systems are increasingly deployed in safety-critical applications, providing formal guarantees on their trustworthiness becomes ever more important. To facilitate the investigation of this challenging problem, we propose the 2nd Workshop on Formal Verification of Machine Learning (WFVML). WFVML will raise awareness for the importance of the formal verification of machine learning systems, bring together researchers from diverse backgrounds with interest in the topic, and enable the discussion of open problems as well as promising avenues in this emerging research area. Building on the success of last year, WFVML features a diverse panel of 8 confirmed invited speakers who made foundational contributions to the young field and an experienced and diverse multi-institutional organizing team of 10, including pioneering proponents of machine learning verification. A schedule combining invited talks, contributed talks, poster sessions, and a panel will provide opportunities and input for open discussions, with remote participation enabled via Zoom. Please see our website for more details.

Sijia Liu · Pin-Yu Chen · Dongxiao Zhu · Eric Wong · Kathrin Grosse · Baharan Mirzasoleiman · Sanmi Koyejo

Given the success of AdvML-inspired research, we propose a new edition from our workshop at ICML’22 (AdvML-Frontiers’22), ‘The 2nd Workshop on New Frontiers in AdvML’ (AdvML-Frontiers’23). We target a high-quality international workshop, coupled with new scientific activities, networking opportunities, and enjoyable social events. Scientifically, we aim to identify the challenges and limitations of current AdvML methods and explore new prospective and constructive views for next-generation AdvML across the full theory/algorithm/application stack. As the sequel to AdvML-Frontiers’22, we will continue exploring the new frontiers of AdvML in theoretical understanding, scalable algorithm and system designs, and scientific development that transcends traditional disciplinary boundaries. We will also add new features and programs in 2023. First, we will expand existing research themes, particularly considering the popularity of large foundational models (e.g., DALL-E 2, Stable Diffusion, and ChatGPT). Examples of topics include AdvML for prompt learning, counteracting AI-synthesized fake images and texts, debugging ML from unified data-model perspectives, and ‘green’ AdvML towards environmental sustainability. Second, we will organize a new section, AI Trust in Industry, by inviting industry experts to introduce the practical trend of AdvML, technological innovations, products, and societal impacts (e.g., AI’s responsibility). Third, we will host a Show-and-Tell Demos in the poster …

Katherine Lee · A. Feder Cooper · David Mimno · Deep Ganguli · FatemehSadat Mireshghallah · James Grimmelmann

Progress in generative AI depends not only on better model architectures, but on terabytes of scraped Flickr images, Wikipedia pages, Stack Overflow answers, and websites. That is, generative models ingest vast quantities of intellectual property (IP), which they can memorize and regurgitate verbatim. Several recently-filed lawsuits relate such memorization to copyright infringement. These lawsuits will lead to policies and legal rulings that define our ability, as ML researchers and practitioners, to acquire training data, and our responsibilities towards data owners and curators. AI researchers will increasingly operate in a legal environment that is keenly interested in their work, an environment that may require future research into model architectures that conform to legal requirements. As such, just as it is vital to inform courts and policymakers of the realities of AI work, ICML attendees must be well informed about law. Our workshop will begin to build a comprehensive and precise synthesis of the legal issues at play. Beyond IP, the workshop will also address privacy and liability for dangerous, discriminatory, or misleading and manipulative outputs. Addressing these challenges requires collaboration between ML researchers and practitioners, data curators, HCI researchers, and legal experts. We will mix tutorial-style presentations from renowned experts in …

Courtney Paquette · Zhenyu Liao · Mihai Nica · Elliot Paquette · Andrew Saxe · Rene Vidal

Modern applications of machine learning seek to extract insights from high-dimensional datasets. The goal of the Workshop on High-dimensional Learning Dynamics (HiLD) is to predict and analyze the dynamics of learning algorithms when the number of samples and parameters are large (i.e., high-dimensional). This workshop seeks to spur research and collaboration around:
1. Developing analyzable models and dynamics to explain observed deep neural network phenomena;
2. Creating mathematical frameworks for scaling limits of neural network dynamics as width and depth grow, which often defy low-dimensional geometric intuitions;
3. The role of overparameterization and how this leads to conserved quantities in the dynamics and the emergence of geometric invariants, with links to Noether's theorem, etc.;
4. Provable impacts of the choice of optimization algorithm, hyper-parameters, and neural network architectures on training/test dynamics.
HiLD Workshop aims to bring together experts from classical random matrix theory, optimization, high-dimensional statistics/probability, and statistical physics to share their perspectives while leveraging crossover experts in ML. It seeks to create synergies between these two groups which often do not interact. Through a series of talks, poster sessions, and panel discussions, the workshop will tackle questions on dynamics of learning algorithms at the interface of random matrix theory, high-dimensional statistics, SDEs, and ML.

Felix Petersen · Marco Cuturi · Mathias Niepert · Hilde Kuehne · Michael Kagan · Willie Neiswanger · Stefano Ermon

Gradients and derivatives are integral to machine learning, as they enable gradient-based optimization. In many real applications, however, models rest on algorithmic components that implement discrete decisions, or rely on discrete intermediate representations and structures. These discrete steps are intrinsically non-differentiable and accordingly break the flow of gradients. Using gradient-based approaches to learn the parameters of such models requires making these non-differentiable components differentiable. This must be done with care, notably by using smoothing or relaxations to construct differentiable proxies for these components. With the advent of modular deep learning frameworks, these ideas have become more popular than ever in many fields of machine learning, generating in a short time-span a multitude of "differentiable everything" approaches, impacting topics as varied as rendering, sorting and ranking, convex optimizers, shortest-paths, dynamic programming, physics simulations, NN architecture search, top-k, graph algorithms, weakly- and self-supervised learning, and many more.
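
As a minimal illustration of the smoothing idea described above, the sketch below relaxes a hard threshold (a discrete decision) into a sigmoid with a temperature parameter. The function names `hard_step`, `soft_step`, and `soft_step_grad` and the choice of a sigmoid relaxation are assumptions made for this example; they are not taken from the workshop description, and many other relaxations exist.

```python
import math


def hard_step(x):
    """Discrete decision: 1.0 if x > 0, else 0.0.

    Its derivative is zero wherever it is defined, so no gradient signal flows through it.
    """
    return 1.0 if x > 0 else 0.0


def soft_step(x, temperature=1.0):
    """Sigmoid relaxation of hard_step (hypothetical helper for illustration).

    A smaller temperature gives a sharper curve that is closer to hard_step.
    The two branches avoid overflow in math.exp for large |x / temperature|.
    """
    z = x / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_step_grad(x, temperature=1.0):
    """Analytic derivative of soft_step with respect to x: s * (1 - s) / temperature."""
    s = soft_step(x, temperature)
    return s * (1.0 - s) / temperature


if __name__ == "__main__":
    for t in (1.0, 0.1, 0.01):
        print(t, soft_step(0.5, t), soft_step_grad(0.5, t))
```

The temperature trades off fidelity against usable gradients: a low temperature tracks the discrete decision closely, but its gradient vanishes away from the threshold, while a high temperature gives smoother gradients at the cost of a looser approximation.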

Antoine Wehenkel · Jörn Jacobsen · Emily Fox · Anuj Karpatne · Victoriya Kashtanova · Xuan Di · Emmanuel de Bézenac · Naoya Takeishi · Gilles Louppe

The "SynS & ML" workshop aims to be an interdisciplinary forum for world-recognized researchers, and anyone else interested, in the challenges of combining scientific and machine-learning (ML) models. While the interaction between Science and ML has been a hot topic in recent years, a venue focused on the unification of scientific and ML modelling is missing. Based on this observation, the goal of this workshop is to bring together ML researchers eager to include scientific models in their pipelines, domain experts working on augmenting their scientific models with ML, and researchers looking for opportunities to incorporate ML in widely-used scientific models.

Berivan Isik · Yibo Yang · Daniel Severo · Karan Ullrich · Robert Bamler · Stephan Mandt

This workshop aims to address fundamental problems in the young but potentially highly impactful field of machine-learning-based data compression. In contrast to other workshops, which focus on practical compression performance on a rate/distortion trade-off, our goal is to encourage idea exchange on more fundamental issues in neural compression such as the role of quantization and stochasticity in compression algorithms, guaranteed distortion bounds, and more compute-efficient models. We aim to address these fundamental issues by bringing together researchers from diverse fields including deep generative modeling, information theory, and inference. We have confirmed expert speakers and panelists from each of these backgrounds.

Swami Sankaranarayanan · Thomas Hartvigsen · Camille Bilodeau · Ryutaro Tanno · Cheng Zhang · Florian Tramer · Phillip Isola

Generative modeling has recently gained massive attention given high-profile successes in natural language processing and computer vision. However, there remain major challenges in deploying generative models for real-world impact in domains like healthcare and biology. This is a challenging agenda that requires collaboration across multiple research fields and industry stakeholders. This workshop aims to advance such interdisciplinary conversations around challenges in deploying generative models, since the lessons learned by deploying large language models could be impactful for high-stakes domains like medicine and biology. Specifically, we will solicit contributions that prioritize (1) multimodal capabilities in generative modeling, (2) deployment-critical features in generative models such as safety, interpretability, robustness, ethics, fairness, and privacy, and (3) human-facing evaluation of generative models. The topic of generative modeling is extremely relevant to the core audience of ICML. Modern generative models impact several fields outside machine learning, and hence responsible deployment of such powerful algorithms has become a major concern of researchers in academia and industry alike. ICML, being the flagship conference of machine learning, is the perfect place to facilitate this cross-disciplinary sharing of knowledge.

Tegan Emerson · Henry Kvinge · Tim Doster · Bastian Rieck · Sophia Sanborn · Nina Miolane

Much of the data that is fueling current rapid advances in machine learning is high dimensional, structurally complex, and strongly nonlinear. This poses challenges for researchers' intuition when they ask (i) how and why current algorithms work and (ii) what tools will lead to the next big breakthrough. Mathematicians working in topology, algebra, and geometry have more than a hundred years' worth of finely-developed machinery whose purpose is to give structure to, help build intuition about, and generally better understand spaces and structures beyond those that we can naturally understand. Following on the success of the first TAG-ML workshop in 2022, this workshop will showcase work that brings methods from topology, algebra, and geometry and uses them to help answer challenging questions in machine learning. Topics include mathematical machine learning, explainability, training schemes, novel algorithms, performance metrics, and performance guarantees. All accepted papers will be included in an associated PMLR volume.

Hyundong Cho · Nayeon Lee · Ninareh Mehrabi · Hsuan Su · Ahmad Beirami · Hung-yi Lee · Jonathan May

The recent breathtaking progress made in generative natural language processing (NLP) has been propelled by large language models and innovative learning methods that intersect machine learning (ML) and NLP, such as Reinforcement Learning from Human Feedback (RLHF), leading to the creation of impressive chatbots like ChatGPT. However, their lack of groundedness, factuality, and interoperability with tools and custom APIs limits them to mostly creative endeavors due to low fidelity and reliability. In contrast, digital assistants in the real world such as Siri, Alexa, and Google Assistant can interface with proprietary APIs, but they still cover a relatively narrow set of use cases that are mostly simple single-turn interactions. By combining the strengths of both, the goal of deploying truly conversational and capable digital assistants that are also trustworthy seems tantalizingly close. What are the remaining challenges for this goal, and how can the ML and NLP communities come together to overcome them? The goal of this workshop is to bring together machine learning researchers and dialogue researchers from academia and industry to encourage knowledge transfer and collaboration on these central questions to discover ideas that can further expand the use cases of conversational AI. The ideal outcome …