Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Alexander Hägele · Elie Bakouch · Atli Kosson · Loubna Ben Allal · Leandro Von Werra · Martin Jaggi

Abstract
