Workshop: Beyond first order methods in machine learning systems

Talk by Rachel Ward - Weighted Optimization: better generalization by smoother interpolation

Rachel Ward


We provide a rigorous analysis of how an implicit bias towards smooth interpolants leads to low generalization error in the overparameterized setting. We present the first case study of this connection, using a random Fourier series model and weighted least squares. Based on this model and on numerical experiments, we then argue that normalization methods in deep learning, such as weight normalization, improve generalization in overparameterized neural networks by implicitly encouraging smooth interpolants. This is joint work with Yuege (Gail) Xie, Holger Rauhut, and Hung-Hsu Chou.
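As a rough illustration of the weighted least squares idea, the sketch below fits more Fourier features than there are samples and compares the plain minimum-norm interpolant with a weighted one that penalizes high frequencies. It is a minimal sketch under stated assumptions, not the setup from the talk: the weight profile (1 + k)^q, the target function, the sample sizes, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def fourier_features(x, K):
    """Real Fourier features [1, cos(kx), sin(kx)] for k = 1..K."""
    k = np.arange(1, K + 1)
    kx = np.outer(x, k)
    return np.hstack([np.ones((x.size, 1)), np.cos(kx), np.sin(kx)])

def frequency_weights(K, q):
    """Assumed weight profile: (1 + k)^q on frequency k, 1 on the constant."""
    w = (1.0 + np.arange(1, K + 1)) ** q
    return np.concatenate([[1.0], w, w])

def weighted_min_norm_fit(x, y, K, q):
    """Minimize sum_j (w_j * theta_j)^2 subject to Phi @ theta = y.

    Substituting theta = u / w turns this into an ordinary minimum-norm
    problem in u, which the pseudoinverse solves.
    """
    Phi = fourier_features(x, K)
    w = frequency_weights(K, q)
    u = np.linalg.pinv(Phi / w) @ y
    return u / w

def target(x):
    # Assumed smooth target with only low frequencies.
    return np.sin(x) + 0.5 * np.cos(2 * x)

rng = np.random.default_rng(0)
n, K = 20, 50  # 2K + 1 = 101 features, 20 samples: overparameterized
x_train = rng.uniform(-np.pi, np.pi, n)
y_train = target(x_train)
x_test = np.linspace(-np.pi, np.pi, 400)

errors = {}
for q in (0.0, 2.0):  # q = 0 is the unweighted minimum-norm interpolant
    theta = weighted_min_norm_fit(x_train, y_train, K, q)
    train_residual = np.max(np.abs(fourier_features(x_train, K) @ theta - y_train))
    test_rmse = np.sqrt(np.mean((fourier_features(x_test, K) @ theta - target(x_test)) ** 2))
    errors[q] = (train_residual, test_rmse)
    print(f"q={q}: max train residual={train_residual:.2e}, test RMSE={test_rmse:.3f}")
```

Both fits interpolate the training data; they differ only in which interpolant is selected. In this toy setting, the weighted fit (q = 2) should give the lower test error, because it favors solutions whose energy sits in low frequencies.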
