Invited Talk
in
Workshop: 2nd Workshop on Advancing Neural Network Training: Computational Efficiency, Scalability, and Resource Optimization (WANT@ICML 2024)
Making device-agnostic ML training and inference easy at scale
Zach Mueller
Using Hugging Face Accelerate as a case study, we will discuss an approach to building a framework aimed at decentralizing and lowering the barrier to entry for machine learning model training on any hardware and/or accelerator configuration (FSDP, DeepSpeed, multi/single GPU, MPS/XLA/CUDA, etc.). Because such a framework needs to be as robust as possible, a number of challenges arise: keeping the code approachable, keeping the API intuitive, and making configuration-related errors as clear as possible.
In this talk, we will cover why Accelerate is designed the way it is and the API decisions behind it, such as expressing common configurations through config files wherever possible, maintaining a zero-magic-code policy, and ensuring minimal code intrusion when using the framework. We will also discuss how building an open-source-forward framework helps the community flourish by making it easy to hack on, extend, and fix different parts of the codebase, in many cases enabling users to adopt new backends, kernels, and more on day 0.