

DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

Sang Michael Xie · Hieu Pham · Xuanyi Dong · Nan Du · Hanxiao Liu · Yifeng Lu · Percy Liang · Quoc Le · Tengyu Ma · Adams Wei Yu

Abstract
