A Closer Look at Memorization in Deep Networks
David Krueger · Yoshua Bengio · Stanislaw Jastrzebski · Maxinder S. Kanwal · Nicolas Ballas · Asja Fischer · Emmanuel Bengio · Devansh Arpit · Tegan Maharaj · Aaron Courville · Simon Lacoste-Julien

Mon Aug 07 10:30 PM -- 10:48 PM (PDT) @ Darling Harbour Theatre

We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize learning simple patterns first. In our experiments, we expose qualitative differences in gradient-based optimization of deep neural networks (DNNs) on noise vs. real data. We also demonstrate that, with appropriately tuned explicit regularization (e.g., dropout), we can degrade DNN training performance on noise datasets without compromising generalization on real data. Our analysis suggests that notions of effective capacity that are dataset independent are unlikely to explain the generalization performance of deep networks trained with gradient-based methods, because the training data itself plays an important role in determining the degree of memorization.
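To make the "noise vs. real data" comparison concrete, the following is a minimal sketch of one common way to build a partially noisy classification dataset: replacing a fraction of the labels with uniformly random classes. The function name corrupt_labels and its parameters are hypothetical and chosen for illustration only; the abstract does not specify how the noise datasets were constructed, so this should not be read as the authors' exact procedure.

```python
import random


def corrupt_labels(labels, num_classes, noise_fraction, seed=0):
    """Return (noisy_labels, corrupted_indices).

    A random subset of size round(noise_fraction * len(labels)) has its
    label replaced by a class drawn uniformly from range(num_classes).
    A replaced label may coincide with the original one by chance.
    This is an illustrative assumption, not a procedure from the paper.
    """
    if not 0.0 <= noise_fraction <= 1.0:
        raise ValueError("noise_fraction must be in [0, 1]")
    rng = random.Random(seed)  # local RNG so results are reproducible
    noisy = list(labels)       # copy; the input is left untouched
    n_corrupt = round(noise_fraction * len(noisy))
    corrupted = rng.sample(range(len(noisy)), n_corrupt)
    for i in corrupted:
        noisy[i] = rng.randrange(num_classes)
    return noisy, sorted(corrupted)


if __name__ == "__main__":
    clean = [i % 10 for i in range(1000)]  # toy 10-class labels
    noisy, idx = corrupt_labels(clean, num_classes=10, noise_fraction=0.4)
    print(len(idx), "labels were resampled")
```

Training the same network on the clean labels and on the corrupted labels, and comparing how quickly each subset is fit, is one way to probe the memorization behaviour the abstract describes.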

Author Information

David Krueger (MILA)
Yoshua Bengio (U. Montreal)

Yoshua Bengio is recognized as one of the world’s leading experts in artificial intelligence and a pioneer in deep learning. Since 1993, he has been a professor in the Department of Computer Science and Operational Research at the Université de Montréal. He is the founder and scientific director of Mila, the Quebec Institute of Artificial Intelligence, the world’s largest university-based research group in deep learning. He is a member of the NeurIPS board, co-founder and general chair of the ICLR conference, and program director of the CIFAR program on Learning in Machines and Brains, and he is a Fellow of the same institution. In 2018, Yoshua Bengio ranked as the computer scientist with the most new citations worldwide, thanks to his many publications. In 2019, he received the ACM A.M. Turing Award, “the Nobel Prize of Computing”, jointly with Geoffrey Hinton and Yann LeCun for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing. In 2020, he was nominated Fellow of the Royal Society of London.

Stanislaw Jastrzebski (Jagiellonian University)
Maxinder S. Kanwal (UC Berkeley)
Nicolas Ballas (Université de Montréal)
Asja Fischer (Computer Science Department, University of Bonn)
Emmanuel Bengio (McGill University)
Devansh Arpit
Tegan Maharaj
Aaron Courville (University of Montreal)
Simon Lacoste-Julien (University of Montreal)
