Continual Pre-Training of Large Language Models: How to re-warm your model?

Kshitij Gupta · Benjamin Thérien · Adam Ibrahim · Mats Richter · Quentin Anthony · Eugene Belilovsky · Timothée Lesort · Irina Rish
