DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

Sang Michael Xie ⋅ Hieu Pham ⋅ Xuanyi Dong ⋅ Nan Du ⋅ Hanxiao Liu ⋅ Yifeng Lu ⋅ Percy Liang ⋅ Quoc Le ⋅ Tengyu Ma ⋅ Adams Wei Yu

Abstract
