Algorithmic approaches endow deep learning systems with implicit bias that helps them generalize even in over-parametrized settings. In this paper, we focus on understanding such a bias induced in learning through dropout, a popular technique to avoid overfitting in deep learning. For shallow linear neural networks, we show that dropout tends to make the norm of incoming/outgoing weight vectors of all the hidden nodes equal. We completely characterize the optimization landscape of single hidden-layer linear networks with dropout.
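The equalizing effect described above stems from the explicit penalty that dropout induces in this setting: marginalizing the Bernoulli mask in a single-hidden-layer linear network yields the plain squared loss plus a term coupling the incoming and outgoing weights of each hidden node. A minimal numerical sketch of that decomposition (dimensions, the keep probability `theta`, and all variable names are illustrative, not taken from the paper):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
d, h, m = 4, 3, 2     # input dim, hidden width, output dim
theta = 0.5           # keep probability of each hidden node
U = rng.standard_normal((m, h))   # outgoing weights: column u_i per node
V = rng.standard_normal((h, d))   # incoming weights: row v_i per node
x = rng.standard_normal(d)
y = rng.standard_normal(m)

# Exact expected dropout loss: enumerate all 2^h Bernoulli masks.
expected = 0.0
for mask in itertools.product([0, 1], repeat=h):
    b = np.array(mask, dtype=float)
    prob = np.prod(np.where(b == 1, theta, 1 - theta))
    pred = U @ (b / theta * (V @ x))  # inverted-dropout scaling by 1/theta
    expected += prob * np.sum((pred - y) ** 2)

# Closed form: squared loss plus a per-node penalty ||u_i||^2 (v_i^T x)^2
# weighted by (1 - theta) / theta.
plain = np.sum((U @ V @ x - y) ** 2)
reg = (1 - theta) / theta * np.sum(np.sum(U ** 2, axis=0) * (V @ x) ** 2)
assert np.isclose(expected, plain + reg)
```

For a fixed product `U @ V`, shrinking this penalty balances the per-node norms, which is consistent with the norm-equalization the abstract describes.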
Author Information
Poorya Mianjy (Johns Hopkins University)
Raman Arora (Johns Hopkins University)

Raman Arora received his M.S. and Ph.D. degrees in Electrical and Computer Engineering from the University of Wisconsin-Madison in 2005 and 2009, respectively. From 2009 to 2011, he was a Postdoctoral Research Associate at the University of Washington in Seattle and a Visiting Researcher at Microsoft Research Redmond. Since 2011, he has been with the Toyota Technological Institute at Chicago (TTIC). His research interests include machine learning, speech recognition, and statistical signal processing.
Rene Vidal (Johns Hopkins University)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Poster: On the Implicit Bias of Dropout
  Fri. Jul 13th, 04:15 -- 07:00 PM, Hall B #69
More from the Same Authors
- 2023 Poster: Faster Rates of Convergence to Stationary Points in Differentially Private Optimization
  Raman Arora · Raef Bassily · Tomás González · Cristobal Guzman · Michael Menart · Enayat Ullah
- 2023 Poster: From Adaptive Query Release to Machine Unlearning
  Enayat Ullah · Raman Arora
- 2021 Poster: Robust Learning for Data Poisoning Attacks
  Yunjuan Wang · Poorya Mianjy · Raman Arora
- 2021 Spotlight: Robust Learning for Data Poisoning Attacks
  Yunjuan Wang · Poorya Mianjy · Raman Arora
- 2021 Poster: Dropout: Explicit Forms and Capacity Control
  Raman Arora · Peter Bartlett · Poorya Mianjy · Nati Srebro
- 2021 Spotlight: Dropout: Explicit Forms and Capacity Control
  Raman Arora · Peter Bartlett · Poorya Mianjy · Nati Srebro
- 2020 Poster: FetchSGD: Communication-Efficient Federated Learning with Sketching
  Daniel Rothchild · Ashwinee Panda · Enayat Ullah · Nikita Ivkin · Ion Stoica · Vladimir Braverman · Joseph E Gonzalez · Raman Arora
- 2019 Poster: On Dropout and Nuclear Norm Regularization
  Poorya Mianjy · Raman Arora
- 2019 Oral: On Dropout and Nuclear Norm Regularization
  Poorya Mianjy · Raman Arora
- 2018 Poster: Theoretical Analysis of Sparse Subspace Clustering with Missing Entries
  Manolis Tsakiris · Rene Vidal
- 2018 Oral: Theoretical Analysis of Sparse Subspace Clustering with Missing Entries
  Manolis Tsakiris · Rene Vidal
- 2018 Poster: ADMM and Accelerated ADMM as Continuous Dynamical Systems
  Guilherme Franca · Daniel Robinson · Rene Vidal
- 2018 Poster: Streaming Principal Component Analysis in Noisy Setting
  Teodor Vanislavov Marinov · Poorya Mianjy · Raman Arora
- 2018 Poster: Stochastic PCA with $\ell_2$ and $\ell_1$ Regularization
  Poorya Mianjy · Raman Arora
- 2018 Oral: ADMM and Accelerated ADMM as Continuous Dynamical Systems
  Guilherme Franca · Daniel Robinson · Rene Vidal
- 2018 Oral: Streaming Principal Component Analysis in Noisy Setting
  Teodor Vanislavov Marinov · Poorya Mianjy · Raman Arora
- 2018 Oral: Stochastic PCA with $\ell_2$ and $\ell_1$ Regularization
  Poorya Mianjy · Raman Arora
- 2017 Poster: Hyperplane Clustering Via Dual Principal Component Pursuit
  Manolis Tsakiris · Rene Vidal
- 2017 Talk: Hyperplane Clustering Via Dual Principal Component Pursuit
  Manolis Tsakiris · Rene Vidal