

Poster in Workshop: ES-FoMo II: 2nd Workshop on Efficient Systems for Foundation Models

OpenELM: An Efficient Language Model Family with Open Training and Inference Framework

Sachin Mehta · Mohammad Sekhavat · Qingqing Cao · Maxwell Horton · Yanzi Jin · Chenfan Sun · Seyed Iman Mirzadeh · Mahyar Najibi · Dmitry Belenko · Peter Zatloukal · Mohammad Rastegari


Abstract:

The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases, as well as potential risks. To this end, we release OpenELM, a state-of-the-art open language model. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the model, leading to enhanced accuracy. For example, with a budget of around one billion parameters, OpenELM exhibits a 2.36% improvement in accuracy compared to OLMo while requiring 2× fewer pre-training tokens. Our source code, along with pre-trained model weights and training recipes, is available at https://github.com/apple/corenet. OpenELM HuggingFace models can be found at: https://huggingface.co/apple/OpenELM.
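To make the layer-wise scaling idea concrete, the snippet below is a minimal sketch, assuming the commonly described scheme of linearly interpolating per-layer scaling factors for the number of attention heads and the FFN width across the depth of the network, so that shallower layers receive fewer parameters and deeper layers more. The function name, parameter names, and default values here are illustrative assumptions, not taken from the OpenELM codebase; consult the corenet repository for the actual configuration logic.

```python
import math


def layerwise_scaling(num_layers, model_dim=1280, head_dim=64,
                      alpha_min=0.5, alpha_max=1.0,
                      beta_min=0.5, beta_max=4.0):
    """Sketch of layer-wise parameter allocation (illustrative only).

    For layer i, linearly interpolate a head-count factor between
    alpha_min and alpha_max and an FFN expansion factor between
    beta_min and beta_max, then derive that layer's number of
    attention heads and FFN hidden size from the model dimension.
    """
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        alpha = alpha_min + (alpha_max - alpha_min) * t   # attention scaling
        beta = beta_min + (beta_max - beta_min) * t       # FFN scaling
        num_heads = max(1, int(math.floor(alpha * model_dim / head_dim)))
        ffn_dim = int(math.floor(beta * model_dim))
        configs.append({"layer": i, "num_heads": num_heads, "ffn_dim": ffn_dim})
    return configs


if __name__ == "__main__":
    # Print the per-layer allocation for a small 4-layer example.
    for cfg in layerwise_scaling(num_layers=4):
        print(cfg)
```

Under this kind of scheme, the total parameter budget is redistributed across depth rather than increased, which is how a model can gain accuracy at a fixed (e.g., ~1B) parameter count.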
