Deep learning has been shown to reliably improve in performance on supervised learning tasks when scaling up data, compute, and parameters. In this work, we argue that properly understanding the impact of scale requires a nuanced understanding of dataset composition. Towards this end, we design experiments in the domain of offline reinforcement learning to disentangle the effects of data quantity and quality. Our results comprehensively confirm that performance is bottlenecked by the quality of the data, even in the limit of parameters, compute, and dataset size. Furthermore, we show that the performance of offline reinforcement learning algorithms obeys reliable scaling laws in these settings, allowing performance-at-scale to be extrapolated from a smaller set of experiments.
Author Information
Jacob Buckman (Johns Hopkins University)
Kshitij Gupta (Mila)
Ethan Caballero (Mila)
Rishabh Agarwal (Google DeepMind)
Marc Bellemare (Google DeepMind)
More from the Same Authors
-
2021 : Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Aaron Courville · Tengyu Ma · George Tucker · Sergey Levine -
2021 : Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation »
Evgenii Nikishin · Romina Abachi · Rishabh Agarwal · Pierre-Luc Bacon -
2023 : Continual Pre-Training of Large Language Models: How to re-warm your model? »
Kshitij Gupta · Benjamin Thérien · Adam Ibrahim · Mats Richter · Quentin Anthony · Eugene Belilovsky · Timothée Lesort · Irina Rish -
2023 Poster: Bootstrapped Representations in Reinforcement Learning »
Charline Le Lan · Stephen Tu · Mark Rowland · Anna Harutyunyan · Rishabh Agarwal · Marc Bellemare · Will Dabney -
2023 Poster: The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation »
Mark Rowland · Yunhao Tang · Clare Lyle · Remi Munos · Marc Bellemare · Will Dabney -
2023 Poster: Bigger, Better, Faster: Human-level Atari with human-level efficiency »
Max Schwarzer · Johan Obando Ceron · Aaron Courville · Marc Bellemare · Rishabh Agarwal · Pablo Samuel Castro -
2022 Poster: Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning »
Harley Wiltzer · David Meger · Marc Bellemare -
2022 Spotlight: Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning »
Harley Wiltzer · David Meger · Marc Bellemare -
2021 Social: RL Social »
Dibya Ghosh · Hager Radi · Derek Li · Alex Ayoub · Erfan Miahi · Rishabh Agarwal · Charline Le Lan · Abhishek Naik · John D. Martin · Shruti Mishra · Adrien Ali Taiga -
2021 Poster: Out-of-Distribution Generalization via Risk Extrapolation (REx) »
David Krueger · Ethan Caballero · Joern-Henrik Jacobsen · Amy Zhang · Jonathan Binas · Dinghuai Zhang · Remi Le Priol · Aaron Courville -
2021 Oral: Out-of-Distribution Generalization via Risk Extrapolation (REx) »
David Krueger · Ethan Caballero · Joern-Henrik Jacobsen · Amy Zhang · Jonathan Binas · Dinghuai Zhang · Remi Le Priol · Aaron Courville -
2020 Poster: Revisiting Fundamentals of Experience Replay »
William Fedus · Prajit Ramachandran · Rishabh Agarwal · Yoshua Bengio · Hugo Larochelle · Mark Rowland · Will Dabney -
2020 Poster: An Optimistic Perspective on Offline Deep Reinforcement Learning »
Rishabh Agarwal · Dale Schuurmans · Mohammad Norouzi -
2020 Poster: Representations for Stable Off-Policy Reinforcement Learning »
Dibya Ghosh · Marc Bellemare -
2019 Poster: Learning to Generalize from Sparse and Underspecified Rewards »
Rishabh Agarwal · Chen Liang · Dale Schuurmans · Mohammad Norouzi -
2019 Oral: Learning to Generalize from Sparse and Underspecified Rewards »
Rishabh Agarwal · Chen Liang · Dale Schuurmans · Mohammad Norouzi -
2019 Poster: Statistics and Samples in Distributional Reinforcement Learning »
Mark Rowland · Robert Dadashi · Saurabh Kumar · Remi Munos · Marc Bellemare · Will Dabney -
2019 Oral: Statistics and Samples in Distributional Reinforcement Learning »
Mark Rowland · Robert Dadashi · Saurabh Kumar · Remi Munos · Marc Bellemare · Will Dabney -
2019 Poster: The Value Function Polytope in Reinforcement Learning »
Robert Dadashi · Marc Bellemare · Adrien Ali Taiga · Nicolas Le Roux · Dale Schuurmans -
2019 Poster: DeepMDP: Learning Continuous Latent Space Models for Representation Learning »
Carles Gelada · Saurabh Kumar · Jacob Buckman · Ofir Nachum · Marc Bellemare -
2019 Oral: The Value Function Polytope in Reinforcement Learning »
Robert Dadashi · Marc Bellemare · Adrien Ali Taiga · Nicolas Le Roux · Dale Schuurmans -
2019 Oral: DeepMDP: Learning Continuous Latent Space Models for Representation Learning »
Carles Gelada · Saurabh Kumar · Jacob Buckman · Ofir Nachum · Marc Bellemare -
2018 Poster: Is Generator Conditioning Causally Related to GAN Performance? »
Augustus Odena · Jacob Buckman · Catherine Olsson · Tom B Brown · Christopher Olah · Colin Raffel · Ian Goodfellow -
2018 Oral: Is Generator Conditioning Causally Related to GAN Performance? »
Augustus Odena · Jacob Buckman · Catherine Olsson · Tom B Brown · Christopher Olah · Colin Raffel · Ian Goodfellow -
2017 : Panel Discussion »
Balaraman Ravindran · Chelsea Finn · Alessandro Lazaric · Katja Hofmann · Marc Bellemare -
2017 : Marc G. Bellemare: The role of density models in reinforcement learning »
Marc Bellemare -
2017 Poster: Count-Based Exploration with Neural Density Models »
Georg Ostrovski · Marc Bellemare · Aäron van den Oord · Remi Munos -
2017 Talk: Count-Based Exploration with Neural Density Models »
Georg Ostrovski · Marc Bellemare · Aäron van den Oord · Remi Munos -
2017 Poster: A Laplacian Framework for Option Discovery in Reinforcement Learning »
Marlos C. Machado · Marc Bellemare · Michael Bowling -
2017 Poster: A Distributional Perspective on Reinforcement Learning »
Marc Bellemare · Will Dabney · Remi Munos -
2017 Poster: Automated Curriculum Learning for Neural Networks »
Alex Graves · Marc Bellemare · Jacob Menick · Remi Munos · Koray Kavukcuoglu -
2017 Talk: A Laplacian Framework for Option Discovery in Reinforcement Learning »
Marlos C. Machado · Marc Bellemare · Michael Bowling -
2017 Talk: A Distributional Perspective on Reinforcement Learning »
Marc Bellemare · Will Dabney · Remi Munos -
2017 Talk: Automated Curriculum Learning for Neural Networks »
Alex Graves · Marc Bellemare · Jacob Menick · Remi Munos · Koray Kavukcuoglu