Keynote by Mikhail Belkin: A Hard Look at Generalization and its Theories
Mikhail Belkin

Fri Jun 14 11:10 AM -- 11:40 AM (PDT)

"A model with zero training error is overfit to the training data and will typically generalize poorly," goes statistical textbook wisdom. Yet in modern practice, over-parametrized deep networks with a near-perfect (interpolating) fit on training data still show excellent test performance. This fact is difficult to reconcile with most modern theories of generalization, which rely on bounding the difference between the empirical and expected error. Indeed, as we will discuss, bounds of that type cannot be expected to explain the generalization of interpolating models. I will proceed to show how classical and modern models can be unified within a new "double descent" risk curve that extends the usual U-shaped bias-variance trade-off curve beyond the point of interpolation. This curve delimits the regime of applicability of classical bounds and the regime where new analyses are required. I will give examples of the first theoretical analyses in that modern regime and discuss the (considerable) gaps in our knowledge. Finally, I will briefly discuss some implications for optimization.
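
The interpolation regime described in the abstract can be explored with a small numerical sketch. The following is not taken from the talk; it is a minimal illustration that assumes a synthetic linear target with Gaussian noise, random ReLU features, and a minimum-norm least-squares fit. All names (random_relu_features, min_norm_fit, risk_curve) and all parameter values are hypothetical choices made for this example. It records training and test error as the number of features p grows past the number of training points n, which is the axis along which a "double descent" curve would be plotted.

```python
import numpy as np


def random_relu_features(x, weights):
    """Map inputs of shape (n, d) to ReLU random features of shape (n, p)."""
    return np.maximum(x @ weights, 0.0)


def min_norm_fit(features, y):
    """Least-squares fit; lstsq returns the minimum-norm solution when p > n."""
    coef, *_ = np.linalg.lstsq(features, y, rcond=None)
    return coef


def risk_curve(n_train=40, n_test=500, d=5,
               widths=(5, 10, 20, 40, 80, 400), seed=0):
    """Return (p, train_mse, test_mse) for each feature count p.

    Assumed setup (illustrative only): linear target, noisy training
    labels, noise-free test labels.
    """
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    x_train = rng.normal(size=(n_train, d))
    x_test = rng.normal(size=(n_test, d))
    y_train = x_train @ w_true + 0.5 * rng.normal(size=n_train)
    y_test = x_test @ w_true

    results = []
    for p in widths:
        # Fresh random first-layer weights for each model size.
        weights = rng.normal(size=(d, p)) / np.sqrt(d)
        f_train = random_relu_features(x_train, weights)
        f_test = random_relu_features(x_test, weights)
        coef = min_norm_fit(f_train, y_train)
        train_mse = float(np.mean((f_train @ coef - y_train) ** 2))
        test_mse = float(np.mean((f_test @ coef - y_test) ** 2))
        results.append((p, train_mse, test_mse))
    return results


if __name__ == "__main__":
    for p, train_mse, test_mse in risk_curve():
        print(f"p={p:4d}  train MSE={train_mse:.3e}  test MSE={test_mse:.3e}")
```

With far fewer features than training points, the model cannot fit the noisy labels exactly; with many more features than points, the training error drops to numerical zero. Whether the test error shows a visible peak near p = n depends on the noise level, the feature map, and the random seed, so this sketch should be read as a way to probe the phenomenon rather than as a reproduction of any result from the talk.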

Bio: Mikhail Belkin is a Professor in the departments of Computer Science and Engineering and Statistics at the Ohio State University. He received a PhD in mathematics from the University of Chicago in 2003. His research focuses on understanding the fundamental structure in data, the principles of recovering these structures, and their computational, mathematical, and statistical properties. This understanding, in turn, leads to algorithms for dealing with real-world data. His work includes algorithms such as Laplacian Eigenmaps and Manifold Regularization, based on ideas from classical differential geometry, which have been widely used for analyzing non-linear high-dimensional data. He has done work on spectral methods, Gaussian mixture models, kernel methods, and applications. Recently, his work has focused on understanding generalization and optimization in modern over-parametrized machine learning. Prof. Belkin is a recipient of an NSF CAREER Award and a number of best paper and other awards, and has served on the editorial boards of the Journal of Machine Learning Research and IEEE PAMI.

Author Information

Mikhail Belkin (Ohio State University)

More from the Same Authors