Do Larger Language Models Imply Better Generalization? A Pretraining Scaling Law for Implicit Reasoning

Xinyi Wang · Shawn Tan · Mingyu Jin · William Wang · Rameswar Panda · Yikang Shen

Abstract
