
Suboptimal Data Can Bottleneck Scaling
Jacob Buckman · Kshitij Gupta · Ethan Caballero · Rishabh Agarwal · Marc Bellemare

Deep learning models have been shown to reliably improve in performance on supervised learning tasks when data, compute, and parameters are scaled up. In this work, we argue that properly understanding the impact of scale requires a nuanced understanding of dataset composition. To this end, we design experiments in the domain of offline reinforcement learning to disentangle the effects of data quantity and quality. Our results comprehensively confirm that performance is bottlenecked by the quality of the data, even in the limit of parameters, compute, and dataset size. Furthermore, we show that the performance of offline reinforcement learning algorithms obeys reliable scaling laws in these settings, allowing performance-at-scale to be extrapolated from a smaller set of experiments.
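The extrapolation idea in the abstract can be illustrated with a minimal sketch. Assuming (hypothetically, not from the paper) that error follows a power law in dataset size, error(n) = a · n^(−b), the coefficients can be fit on small-scale runs by linear regression in log-log space and then used to predict performance at larger scale. All constants and function names below are illustrative assumptions:

```python
import numpy as np

def fit_power_law(sizes, errors):
    """Fit error = a * size^(-b) via least-squares in log-log space."""
    log_n = np.log(sizes)
    log_e = np.log(errors)
    # np.polyfit returns [slope, intercept]; slope = -b, intercept = log(a)
    slope, intercept = np.polyfit(log_n, log_e, 1)
    return np.exp(intercept), -slope  # (a, b)

def extrapolate(a, b, size):
    """Predicted error at a larger (unobserved) dataset size."""
    return a * size ** (-b)

# Synthetic small-scale measurements generated with a=2.0, b=0.5
# (assumed values for illustration only).
sizes = np.array([1e3, 1e4, 1e5])
errors = 2.0 * sizes ** -0.5

a, b = fit_power_law(sizes, errors)
predicted = extrapolate(a, b, 1e7)  # extrapolate two orders of magnitude out
```

On clean power-law data the fit recovers the generating coefficients exactly; real training curves are noisier, so in practice one would fit on several seeds and report uncertainty on the extrapolation.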

Author Information

Jacob Buckman (Johns Hopkins University)
Kshitij Gupta (Mila)
Ethan Caballero (Mila)
Rishabh Agarwal (Google DeepMind)
Marc Bellemare (Google DeepMind)
