

SuperShaper: A Pre-Training Approach for Discovering Efficient Transformer Shapes

Vinod Ganesan · Gowtham Ramesh · Pratyush Kumar · Raj Dabre

Abstract
