Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Alexander Hägele ⋅ Elie Bakouch ⋅ Atli Kosson ⋅ Loubna Ben Allal ⋅ Leandro von Werra ⋅ Martin Jaggi

Abstract
