Do Larger Language Models Imply Better Generalization? A Pretraining Scaling Law for Implicit Reasoning

Xinyi Wang ⋅ Shawn Tan ⋅ Mingyu Jin ⋅ William Wang ⋅ Rameswar Panda ⋅ Yikang Shen

Abstract
