The classical bias-variance trade-off predicts that bias decreases and variance increases with model complexity, leading to a U-shaped risk curve. Recent work calls this into question for neural networks and other over-parameterized models, for which larger models are often observed to generalize better. We provide a simple explanation for this by measuring the bias and variance of neural networks: while the bias is monotonically decreasing as in the classical theory, the variance is unimodal or bell-shaped: it increases and then decreases with the width of the network. We vary the network architecture, loss function, and choice of dataset and confirm that variance unimodality occurs robustly for all models we considered. The risk curve is the sum of the bias and variance curves, and it displays different qualitative shapes depending on the relative scale of bias and variance, with the double descent of the recent literature as a special case. We corroborate these empirical results with a theoretical analysis of two-layer linear networks with a random first layer. Finally, evaluation on out-of-distribution data shows that most of the drop in accuracy comes from increased bias, while variance increases by a relatively small amount. Moreover, we find that deeper models decrease bias and increase variance for both in-distribution and out-of-distribution data.
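The decomposition the abstract relies on is, for squared loss, risk = bias² + variance (+ noise), where the expectation is taken over draws of the training set. As an illustration of how such a decomposition can be estimated empirically, here is a minimal sketch that retrains a learner on independent training sets and splits its test risk into the two terms. It uses polynomial regression as the learner purely for illustration, not the paper's neural networks or its exact estimator; the function names, the degree, and the run counts are assumptions of this sketch.

```python
# Minimal sketch: estimate squared bias and variance of a learner under
# squared loss by retraining on independent training samples.
# Illustrative stand-in (polynomial regression), not the paper's estimator.
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    return np.sin(2 * np.pi * x)

def sample_train(n=30, noise=0.1):
    x = rng.uniform(0, 1, n)
    y = true_fn(x) + noise * rng.standard_normal(n)
    return x, y

x_test = np.linspace(0, 1, 200)
y_test = true_fn(x_test)

degree = 9                  # "model complexity" knob, analogous to width
preds = []
for _ in range(200):        # retrain on fresh, independent training sets
    x_tr, y_tr = sample_train()
    coefs = np.polyfit(x_tr, y_tr, degree)   # least-squares fit
    preds.append(np.polyval(coefs, x_test))
preds = np.array(preds)     # shape: (runs, test points)

mean_pred = preds.mean(axis=0)
bias2 = np.mean((mean_pred - y_test) ** 2)   # squared bias of mean predictor
variance = np.mean(preds.var(axis=0))        # variance across retrainings
print(f"bias^2 = {bias2:.4f}, variance = {variance:.4f}")
```

Sweeping `degree` and plotting the two terms would trace out the bias and variance curves whose sum gives the risk curve discussed in the abstract.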
Author Information
Zitong Yang (University of California, Berkeley)
Yaodong Yu (University of California, Berkeley)
Chong You (University of California, Berkeley)
Jacob Steinhardt (University of California, Berkeley)
Yi Ma (University of California, Berkeley)
More from the Same Authors
- 2022: Robust Calibration with Multi-domain Temperature Scaling
  Yaodong Yu · Stephen Bates · Yi Ma · Michael Jordan
- 2022: What You See is What You Get: Distributional Generalization for Algorithm Design in Deep Learning
  Bogdan Kulynych · Yao-Yuan Yang · Yaodong Yu · Jarosław Błasiok · Preetum Nakkiran
- 2023 Poster: Understanding the Complexity Gains of Single-Task RL with a Curriculum
  Qiyang Li · Yuexiang Zhai · Yi Ma · Sergey Levine
- 2023 Poster: Federated Conformal Predictors for Distributed Uncertainty Quantification
  Charles Lu · Yaodong Yu · Sai Karimireddy · Michael Jordan · Ramesh Raskar
- 2022: Distribution Shift Through the Lens of Explanations
  Jacob Steinhardt
- 2022 Poster & Spotlight: Scaling Out-of-Distribution Detection for Real-World Settings
  Dan Hendrycks · Steven Basart · Mantas Mazeika · Andy Zou · Joseph Kwon · Mohammadreza Mostajabi · Jacob Steinhardt · Dawn Song
- 2022 Poster & Spotlight: Predicting Out-of-Distribution Error with the Projection Norm
  Yaodong Yu · Zitong Yang · Alexander Wei · Yi Ma · Jacob Steinhardt
- 2022 Poster & Spotlight: More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
  Alexander Wei · Wei Hu · Jacob Steinhardt
- 2022 Poster & Spotlight: Describing Differences between Text Distributions with Natural Language
  Ruiqi Zhong · Charlie Snell · Dan Klein · Jacob Steinhardt
- 2022 Poster & Spotlight: Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback
  Tianyi Lin · Aldo Pacchiano · Yaodong Yu · Michael Jordan
- 2021 Poster & Spotlight: Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models
  Zitong Yang · Yu Bai · Song Mei
- 2020 Poster: Deep Isometric Learning for Visual Recognition
  Haozhi Qi · Chong You · Xiaolong Wang · Yi Ma · Jitendra Malik