Continual Pre-Training of Large Language Models: How to re-warm your model?

Kshitij Gupta ⋅ Benjamin Thérien ⋅ Adam Ibrahim ⋅ Mats Richter ⋅ Quentin Anthony ⋅ Eugene Belilovsky ⋅ Timothée Lesort ⋅ Irina Rish
