ICML 2024


Workshop

ES-FoMo II: 2nd Workshop on Efficient Systems for Foundation Models

Julien Launay · Tri Dao · Daniel Y Fu · Max Ryabinin · Daniel Hesslow · Beidi Chen · Percy Liang

Lehar 2
Fri 26 Jul, midnight PDT

As models increase in size and training budget, they not only systematically improve in upstream quality but also exhibit novel emergent capabilities, unlocking new AI applications. These new capabilities have led to a paradigm shift: large foundation models have become predominant in natural language processing and are growing increasingly common in computer vision, audio processing, and even robotics. This increase in scale raises proportionate difficulties for practitioners: foundation model training and inference lie at a unique interdisciplinary crossroads, combining open problems in algorithms, system design, and software engineering.

In response to these challenges, diverse research directions have produced promising work: (1) training and inference either at large scale or in resource-constrained scenarios (e.g., with higher network latency and lower bandwidth, in a collaborative manner across a fleet of contributed devices, or with a single GPU); (2) large-scale distributed training approaches, such as 3D parallelism and sharding; and (3) deep system optimizations, with custom languages such as TVM and Triton. These novel interdisciplinary research directions directly shape and impact the trajectory of research across machine learning.

Accordingly, these emerging lines of research are increasingly relevant to machine learning researchers. Indeed, researchers are key stakeholders: on the one hand, researchers may contribute algorithmic insights and novel methods to improve the training and inference of large models (e.g., recent award-winning papers at ICML and NeurIPS); on the other hand, novel research findings may be best demonstrated at scale, which may require training models as efficiently as possible to make the best use of available resources.

The goal of this workshop is to bring together interdisciplinary experts working on the emerging research questions and challenges associated with foundation model training and inference. This is the second installment of the ES-FoMo workshop at ICML. This year, we are bringing further focus to three trends observed in 2023: (1) the emergence of novel architectures, popularized by Mixtral (mixture-of-experts) and Mamba (state-space models); (2) efficient open implementations, such as vLLM and gpt-fast; and (3) open questions on novel hardware and data tooling. We look forward to continuing to grow this community at ICML 2024.
