Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models

Thao Nguyen ⋅ Yang Li ⋅ Olga Golovneva ⋅ Luke Zettlemoyer ⋅ Sewoong Oh ⋅ Ludwig Schmidt ⋅ Xian Li

Abstract
