Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models

Thao Nguyen · Yang Li · Olga Golovneva · Luke Zettlemoyer · Sewoong Oh · Ludwig Schmidt · Xian Li

Abstract
