

Poster

GradPower: Powering Gradients for Faster Language Model Pre-Training

Jinbo Wang ⋅ Mingze Wang ⋅ Jiaqi Zhang ⋅ Peng Pei ⋅ Wei Wang ⋅ Xunliang Cai ⋅ Weinan E ⋅ Lei Wu

Abstract
