Poster in Workshop: Data-centric Machine Learning Research (DMLR): Datasets for Foundation Models
Weak-to-Strong Generalization Through the Data-Centric Lens
Changho Shin · John Cooper · Dyah Adila · Frederic Sala
The weak-to-strong generalization phenomenon drives important machine learning applications, including highly data-efficient learning and, most recently, superalignment. While decades of research have produced numerous algorithms with strong empirical performance, the question of which aspects of data enable weak-to-strong generalization remains understudied. We propose a simple data-centric mechanism that characterizes weak-to-strong generalization: the overlap density. Intuitively, generalization tracks the number of points that contain overlaps, i.e., both easy patterns (learnable by a weak model) and challenging patterns (learnable only by a stronger model). On such points, a stronger model can use weak predictions to learn deeper relationships. Theoretically, we provide a simple result showing that the generalization benefit is a function of the overlap density. Empirically, we validate the mechanism across a wide array of settings. Finally, we provide an algorithm that learns, among multiple sources of data, which to query to maximize generalization. On a benchmark dataset for weak-to-strong generalization, our approach provides a lift of 2.4 points over randomly sampling data.
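
As a rough illustration of the overlap density idea in the abstract, the toy sketch below treats each data point as a pair of flags saying whether it contains an easy pattern and a hard pattern, and measures the fraction of points with both. It is not the authors' algorithm: the names `make_source`, `overlap_density`, and `pick_source` are hypothetical, the sources and their probabilities are synthetic, and the flags are assumed to be directly observable, which would not hold in practice, where overlap would have to be estimated.

```python
import random


def make_source(n, p_easy, p_hard, seed):
    """Hypothetical data source: each point is a pair (has_easy, has_hard).

    Assumption: the two pattern types occur independently, with the given
    probabilities. Real data sources need not behave this way.
    """
    rng = random.Random(seed)
    return [(rng.random() < p_easy, rng.random() < p_hard) for _ in range(n)]


def overlap_density(points):
    """Fraction of points that contain both an easy and a hard pattern."""
    if not points:
        return 0.0
    overlaps = sum(1 for has_easy, has_hard in points if has_easy and has_hard)
    return overlaps / len(points)


def pick_source(sources):
    """Return the name of the source with the highest empirical overlap density."""
    return max(sources, key=lambda name: overlap_density(sources[name]))


# Three synthetic sources: mostly easy, mixed, and mostly hard.
sources = {
    "A": make_source(1000, p_easy=0.9, p_hard=0.1, seed=0),
    "B": make_source(1000, p_easy=0.7, p_hard=0.7, seed=1),
    "C": make_source(1000, p_easy=0.2, p_hard=0.9, seed=2),
}

densities = {name: overlap_density(points) for name, points in sources.items()}
best = pick_source(sources)

for name, density in sorted(densities.items()):
    print(f"source {name}: overlap density = {density:.3f}")
print("source to query:", best)
```

In this toy setup the mixed source has the most overlap points, so a greedy rule would query it first. A practical method would also have to balance estimating the densities against exploiting the current best source, which this sketch does not attempt.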