Poster
Characterizing Implicit Bias in Terms of Optimization Geometry
Suriya Gunasekar · Jason Lee · Daniel Soudry · Nati Srebro
We study the bias of generic optimization methods, including Mirror Descent, Natural Gradient Descent, and Steepest Descent with respect to different potentials and norms, when optimizing underdetermined linear models or separable linear classification problems. We ask whether the global minimum reached by optimization (among the many possible global minima) can be characterized in terms of the potential or norm, independently of hyper-parameter choices such as stepsize and momentum.
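As a small illustration of the kind of question posed above, the following sketch runs plain gradient descent on an underdetermined least-squares problem and compares the result with the minimum Euclidean-norm interpolating solution. It is not taken from the paper: the toy data, the zero initialization, the step sizes, and the function names are all assumptions chosen for illustration. It covers only the simplest case, gradient descent under the squared loss, which is steepest descent with respect to the Euclidean norm.

```python
# Illustrative sketch (assumed toy setup, not the authors' code):
# gradient descent on 0.5 * ||Xw - y||^2 with 2 equations and 3 unknowns.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gradient_descent(X, y, lr, steps=5000):
    """Run gradient descent from w = 0 on the squared loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d  # zero initialization (an assumption of this sketch)
    for _ in range(steps):
        residual = [dot(X[i], w) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(d)]
        w = [w[j] - lr * grad[j] for j in range(d)]
    return w

def min_l2_norm_solution(X, y):
    """Closed form w = X^T (X X^T)^{-1} y, hard-coded for exactly two equations."""
    a, b, c = dot(X[0], X[0]), dot(X[0], X[1]), dot(X[1], X[1])
    det = a * c - b * b
    alpha0 = (c * y[0] - b * y[1]) / det
    alpha1 = (a * y[1] - b * y[0]) / det
    return [alpha0 * X[0][j] + alpha1 * X[1][j] for j in range(len(X[0]))]

# Hypothetical toy data: infinitely many w satisfy Xw = y.
X = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]

w_min = min_l2_norm_solution(X, y)
w_gd_small = gradient_descent(X, y, lr=0.05)
w_gd_large = gradient_descent(X, y, lr=0.2)

print("min-norm solution:", w_min)
print("GD, lr=0.05      :", w_gd_small)
print("GD, lr=0.2       :", w_gd_large)
```

In this toy setting both step sizes lead to the same interpolating solution, the one with the smallest Euclidean norm among all global minima. This is the sort of stepsize-independent characterization the abstract asks about for more general potentials and norms.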
Author Information
Suriya Gunasekar (Toyota Technological Institute at Chicago)
Jason Lee (University of Southern California)
Daniel Soudry (Technion)
Nati Srebro (Toyota Technological Institute at Chicago)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Oral: Characterizing Implicit Bias in Terms of Optimization Geometry »
Thu. Jul 12th 11:50 AM -- 12:00 PM Room A9
More from the Same Authors
-
2021 : Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm »
Meena Jagadeesan · Ilya Razenshteyn · Suriya Gunasekar -
2023 : When is Agnostic Reinforcement Learning Statistically Tractable? »
Gene Li · Zeyu Jia · Alexander Rakhlin · Ayush Sekhari · Nati Srebro -
2023 : On the Still Unreasonable Effectiveness of Federated Averaging for Heterogeneous Distributed Learning »
Kumar Kshitij Patel · Margalit Glasgow · Lingxiao Wang · Nirmit Joshi · Nati Srebro -
2023 Poster: Federated Online and Bandit Convex Optimization »
Kumar Kshitij Patel · Lingxiao Wang · Aadirupa Saha · Nati Srebro -
2023 Poster: Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond »
Itai Kreisler · Mor Shpigel Nacson · Daniel Soudry · Yair Carmon -
2023 Poster: Continual Learning in Linear Classification on Separable Data »
Itay Evron · Edward Moroshko · Gon Buzaglo · Maroun Khriesh · Badea Marjieh · Nati Srebro · Daniel Soudry -
2022 Poster: Data Augmentation as Feature Manipulation »
Ruoqi Shen · Sebastien Bubeck · Suriya Gunasekar -
2022 Poster: Implicit Bias of the Step Size in Linear Diagonal Neural Networks »
Mor Shpigel Nacson · Kavya Ravichandran · Nati Srebro · Daniel Soudry -
2022 Spotlight: Implicit Bias of the Step Size in Linear Diagonal Neural Networks »
Mor Shpigel Nacson · Kavya Ravichandran · Nati Srebro · Daniel Soudry -
2022 Spotlight: Data Augmentation as Feature Manipulation »
Ruoqi Shen · Sebastien Bubeck · Suriya Gunasekar -
2021 : Function space view of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm »
Suriya Gunasekar -
2021 Poster: Fast margin maximization via dual acceleration »
Ziwei Ji · Nati Srebro · Matus Telgarsky -
2021 Spotlight: Fast margin maximization via dual acceleration »
Ziwei Ji · Nati Srebro · Matus Telgarsky -
2021 Poster: Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels »
Eran Malach · Pritish Kamath · Emmanuel Abbe · Nati Srebro -
2021 Spotlight: Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels »
Eran Malach · Pritish Kamath · Emmanuel Abbe · Nati Srebro -
2021 Poster: Dropout: Explicit Forms and Capacity Control »
Raman Arora · Peter Bartlett · Poorya Mianjy · Nati Srebro -
2021 Spotlight: Dropout: Explicit Forms and Capacity Control »
Raman Arora · Peter Bartlett · Poorya Mianjy · Nati Srebro -
2021 Poster: On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent »
Shahar Azulay · Edward Moroshko · Mor Shpigel Nacson · Blake Woodworth · Nati Srebro · Amir Globerson · Daniel Soudry -
2021 Oral: On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent »
Shahar Azulay · Edward Moroshko · Mor Shpigel Nacson · Blake Woodworth · Nati Srebro · Amir Globerson · Daniel Soudry -
2021 Poster: Accurate Post Training Quantization With Small Calibration Sets »
Itay Hubara · Yury Nahshan · Yair Hanani · Ron Banner · Daniel Soudry -
2021 Spotlight: Accurate Post Training Quantization With Small Calibration Sets »
Itay Hubara · Yury Nahshan · Yair Hanani · Ron Banner · Daniel Soudry -
2020 Poster: Efficiently Learning Adversarially Robust Halfspaces with Noise »
Omar Montasser · Surbhi Goel · Ilias Diakonikolas · Nati Srebro -
2020 Poster: Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization? »
Yaniv Blumenfeld · Dar Gilboa · Daniel Soudry -
2020 Poster: Is Local SGD Better than Minibatch SGD? »
Blake Woodworth · Kumar Kshitij Patel · Sebastian Stich · Zhen Dai · Brian Bullins · Brendan McMahan · Ohad Shamir · Nati Srebro -
2020 Poster: Fair Learning with Private Demographic Data »
Hussein Mozannar · Mesrob Ohannessian · Nati Srebro -
2019 : Nati Srebro: Optimization’s Untold Gift to Learning: Implicit Regularization »
Nati Srebro -
2019 : Poster discussion »
Roman Novak · Maxime Gabella · Frederic Dreyer · Siavash Golkar · Anh Tong · Irina Higgins · Mirco Milletari · Joe Antognini · Sebastian Goldt · Adín Ramírez Rivera · Roberto Bondesan · Ryo Karakida · Remi Tachet des Combes · Michael Mahoney · Nicholas Walker · Stanislav Fort · Samuel Smith · Rohan Ghosh · Aristide Baratin · Diego Granziol · Stephen Roberts · Dmitry Vetrov · Andrew Wilson · César Laurent · Valentin Thomas · Simon Lacoste-Julien · Dar Gilboa · Daniel Soudry · Anupam Gupta · Anirudh Goyal · Yoshua Bengio · Erich Elsen · Soham De · Stanislaw Jastrzebski · Charles H Martin · Samira Shabanian · Aaron Courville · Shorato Akaho · Lenka Zdeborova · Ethan Dyer · Maurice Weiler · Pim de Haan · Taco Cohen · Max Welling · Ping Luo · zhanglin peng · Nasim Rahaman · Loic Matthey · Danilo J. Rezende · Jaesik Choi · Kyle Cranmer · Lechao Xiao · Jaehoon Lee · Yasaman Bahri · Jeffrey Pennington · Greg Yang · Jiri Hron · Jascha Sohl-Dickstein · Guy Gur-Ari -
2019 : Panel Discussion (Nati Srebro, Dan Roy, Chelsea Finn, Mikhail Belkin, Aleksander Mądry, Jason Lee) »
Nati Srebro · Daniel Roy · Chelsea Finn · Mikhail Belkin · Aleksander Madry · Jason Lee -
2019 : Keynote by Jason Lee: On the Foundations of Deep Learning: SGD, Overparametrization, and Generalization »
Jason Lee -
2019 Workshop: Understanding and Improving Generalization in Deep Learning »
Dilip Krishnan · Hossein Mobahi · Behnam Neyshabur · Peter Bartlett · Dawn Song · Nati Srebro -
2019 Poster: Semi-Cyclic Stochastic Gradient Descent »
Hubert Eichner · Tomer Koren · Brendan McMahan · Nati Srebro · Kunal Talwar -
2019 Oral: Semi-Cyclic Stochastic Gradient Descent »
Hubert Eichner · Tomer Koren · Brendan McMahan · Nati Srebro · Kunal Talwar -
2019 Poster: Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints »
Andrew Cotter · Maya Gupta · Heinrich Jiang · Nati Srebro · Karthik Sridharan · Serena Wang · Blake Woodworth · Seungil You -
2019 Poster: Gradient Descent Finds Global Minima of Deep Neural Networks »
Simon Du · Jason Lee · Haochuan Li · Liwei Wang · Xiyu Zhai -
2019 Poster: Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models »
Mor Shpigel Nacson · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry -
2019 Oral: Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints »
Andrew Cotter · Maya Gupta · Heinrich Jiang · Nati Srebro · Karthik Sridharan · Serena Wang · Blake Woodworth · Seungil You -
2019 Oral: Gradient Descent Finds Global Minima of Deep Neural Networks »
Simon Du · Jason Lee · Haochuan Li · Liwei Wang · Xiyu Zhai -
2019 Oral: Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models »
Mor Shpigel Nacson · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry -
2018 Poster: On the Power of Over-parametrization in Neural Networks with Quadratic Activation »
Simon Du · Jason Lee -
2018 Poster: Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks »
Mingyi Hong · Meisam Razaviyayn · Jason Lee -
2018 Oral: On the Power of Over-parametrization in Neural Networks with Quadratic Activation »
Simon Du · Jason Lee -
2018 Oral: Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks »
Mingyi Hong · Meisam Razaviyayn · Jason Lee -
2018 Poster: Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima »
Simon Du · Jason Lee · Yuandong Tian · Aarti Singh · Barnabás Póczos -
2018 Oral: Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima »
Simon Du · Jason Lee · Yuandong Tian · Aarti Singh · Barnabás Póczos -
2017 Poster: Efficient Distributed Learning with Sparsity »
Jialei Wang · Mladen Kolar · Nati Srebro · Tong Zhang -
2017 Talk: Efficient Distributed Learning with Sparsity »
Jialei Wang · Mladen Kolar · Nati Srebro · Tong Zhang -
2017 Poster: Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis »
Dan Garber · Ohad Shamir · Nati Srebro -
2017 Talk: Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis »
Dan Garber · Ohad Shamir · Nati Srebro