Bigger, Better, Faster: Human-level Atari with human-level efficiency
Max Schwarzer · Johan Obando Ceron · Aaron Courville · Marc Bellemare · Rishabh Agarwal · Pablo Samuel Castro

Tue Jul 25 05:00 PM -- 06:30 PM (PDT) @ Exhibit Hall 1 #229

We introduce a value-based RL agent, which we call BBF, that achieves super-human performance on the Atari 100K benchmark. BBF relies on scaling the neural networks used for value estimation, as well as a number of other design choices that enable this scaling in a sample-efficient manner. We conduct extensive analyses of these design choices and provide insights for future work. We end with a discussion about updating the goalposts for sample-efficient RL research on the Arcade Learning Environment (ALE). We make our code and data publicly available at https://github.com/google-research/google-research/tree/master/biggerbetterfaster.
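To make the abstract's central idea, widening the value-estimation network, concrete, here is a quick parameter count. The layer shapes below are the classic Atari DQN convolutional encoder, used purely for illustration; they are an assumption, not BBF's actual architecture (see the linked code for that). The sketch shows how multiplying channel widths by a factor multiplies the parameter count roughly quadratically.

```python
def conv_params(in_ch, out_ch, k):
    """Weights plus biases for one 2-D convolution layer (k x k kernel)."""
    return in_ch * out_ch * k * k + out_ch

# Hypothetical DQN-style encoder: (in_channels, out_channels, kernel_size)
# per layer, with 4 stacked grayscale frames as input.
base = [(4, 32, 8), (32, 64, 4), (64, 64, 3)]

def encoder_params(layers, width=1):
    """Total parameters when every hidden channel count is scaled by `width`.

    The input channel count (stacked frames) is fixed by the environment,
    so only the first layer's output channels are scaled there.
    """
    total = 0
    for i, (cin, cout, k) in enumerate(layers):
        cin_scaled = cin if i == 0 else cin * width
        total += conv_params(cin_scaled, cout * width, k)
    return total

print(encoder_params(base, width=1))  # 77,984 parameters at base width
print(encoder_params(base, width=4))  # 1,147,520 parameters at 4x width
# Widening by 4x grows this encoder ~14.7x, since most layers scale
# in both input and output channels (~width**2), while the first
# layer scales only in its output channels.
```

This is why "a number of other design choices" matter in the paper: naively training a much larger network on only 100K environment steps tends to overfit, so the scaling has to be paired with techniques that keep learning sample-efficient.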

Author Information

Max Schwarzer (Mila, Apple MLR)
Johan Obando Ceron (Mila / Université de Montréal)
Aaron Courville (University of Montreal)
Marc Bellemare (Google DeepMind)
Rishabh Agarwal (Google DeepMind)
Pablo Samuel Castro (Google DeepMind)

Pablo was born and raised in Quito, Ecuador, and moved to Montreal after high school to study at McGill. He stayed in Montreal for the next 10 years, finished his bachelor's, worked at a flight simulator company, and then eventually obtained his master's and PhD at McGill, focusing on Reinforcement Learning. After his PhD, Pablo did a 10-month postdoc in Paris before moving to Pittsburgh to join Google. He has worked at Google for almost 6 years and is currently a research software engineer at Google Brain in Montreal, focusing on fundamental Reinforcement Learning research, as well as Machine Learning and Music. Aside from his interest in coding/AI/math, Pablo is an active musician (https://www.psctrio.com), loves running (5 marathons so far, including Boston!), and enjoys discussing politics and activism.

More from the Same Authors