Abstract
Algorithmic choices endow deep learning systems with an implicit bias that helps them generalize even in over-parametrized settings. In this paper, we focus on understanding the bias induced by dropout, a popular technique for avoiding overfitting in deep learning. For single hidden-layer linear neural networks, we show that dropout tends to equalize the norms of the incoming/outgoing weight vectors across all hidden nodes. In addition, we provide a complete characterization of the optimization landscape induced by dropout.
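The equalization claim is easy to probe numerically. Below is a minimal, hypothetical NumPy sketch (not the authors' code): it runs SGD with inverted Bernoulli dropout on the hidden layer of a single hidden-layer linear network and then prints the per-node norm products, which should come out roughly equal. All dimensions, rates, and step counts are illustrative assumptions.

```python
import numpy as np

# Sketch only: dropout-SGD on a single hidden-layer linear network
# y = U @ diag(b / (1 - p)) @ V.T @ x, with Bernoulli mask b on the
# hidden layer. The abstract's claim is that dropout biases training
# toward "equalized" solutions, i.e. ||u_i|| * ||v_i|| roughly equal
# across hidden nodes i. All hyperparameters here are assumptions.

rng = np.random.default_rng(0)
d_in, d_hid, d_out, n = 10, 5, 10, 2000
p = 0.5          # dropout rate (assumed)
lr = 1e-2        # step size (assumed)

X = rng.standard_normal((n, d_in))
M = rng.standard_normal((d_out, d_in))    # ground-truth linear map
Y = X @ M.T

V = rng.standard_normal((d_in, d_hid)) * 0.1   # input-to-hidden weights
U = rng.standard_normal((d_out, d_hid)) * 0.1  # hidden-to-output weights

for step in range(20000):
    i = rng.integers(n)
    x, y = X[i], Y[i]
    b = rng.binomial(1, 1 - p, size=d_hid) / (1 - p)  # inverted dropout mask
    h = b * (V.T @ x)                       # dropped hidden activations
    err = U @ h - y                         # residual on this sample
    U -= lr * np.outer(err, h)              # gradient step on U
    V -= lr * np.outer(x, b * (U.T @ err))  # gradient step on V

# Per-node products ||u_i|| * ||v_i||; dropout biases these toward a
# common value, illustrating the equalization described in the abstract.
print(np.linalg.norm(U, axis=0) * np.linalg.norm(V, axis=0))
```

Under these assumptions, the printed products concentrate around a single value, consistent with the paper's characterization of equalized networks as the preferred solutions of the dropout objective.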
Author Information
Poorya Mianjy (Johns Hopkins University)
Raman Arora (Johns Hopkins University)

Raman Arora received his M.S. and Ph.D. degrees in Electrical and Computer Engineering from the University of Wisconsin-Madison in 2005 and 2009, respectively. From 2009 to 2011, he was a Postdoctoral Research Associate at the University of Washington in Seattle and a Visiting Researcher at Microsoft Research Redmond. He was subsequently with the Toyota Technological Institute at Chicago (TTIC) before joining Johns Hopkins University. His research interests include machine learning, speech recognition, and statistical signal processing.
Rene Vidal (Johns Hopkins University)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: On the Implicit Bias of Dropout
  Fri. Jul 13th, 03:30 -- 03:40 PM, Room K11
More from the Same Authors
- 2021 Poster: Robust Learning for Data Poisoning Attacks
  Yunjuan Wang · Poorya Mianjy · Raman Arora
- 2021 Spotlight: Robust Learning for Data Poisoning Attacks
  Yunjuan Wang · Poorya Mianjy · Raman Arora
- 2021 Poster: Dropout: Explicit Forms and Capacity Control
  Raman Arora · Peter Bartlett · Poorya Mianjy · Nati Srebro
- 2021 Spotlight: Dropout: Explicit Forms and Capacity Control
  Raman Arora · Peter Bartlett · Poorya Mianjy · Nati Srebro
- 2020 Poster: FetchSGD: Communication-Efficient Federated Learning with Sketching
  Daniel Rothchild · Ashwinee Panda · Enayat Ullah · Nikita Ivkin · Ion Stoica · Vladimir Braverman · Joseph E Gonzalez · Raman Arora
- 2019 Poster: On Dropout and Nuclear Norm Regularization
  Poorya Mianjy · Raman Arora
- 2019 Oral: On Dropout and Nuclear Norm Regularization
  Poorya Mianjy · Raman Arora
- 2018 Poster: Theoretical Analysis of Sparse Subspace Clustering with Missing Entries
  Manolis Tsakiris · Rene Vidal
- 2018 Oral: Theoretical Analysis of Sparse Subspace Clustering with Missing Entries
  Manolis Tsakiris · Rene Vidal
- 2018 Poster: ADMM and Accelerated ADMM as Continuous Dynamical Systems
  Guilherme Franca · Daniel Robinson · Rene Vidal
- 2018 Poster: Streaming Principal Component Analysis in Noisy Setting
  Teodor Vanislavov Marinov · Poorya Mianjy · Raman Arora
- 2018 Poster: Stochastic PCA with $\ell_2$ and $\ell_1$ Regularization
  Poorya Mianjy · Raman Arora
- 2018 Oral: ADMM and Accelerated ADMM as Continuous Dynamical Systems
  Guilherme Franca · Daniel Robinson · Rene Vidal
- 2018 Oral: Streaming Principal Component Analysis in Noisy Setting
  Teodor Vanislavov Marinov · Poorya Mianjy · Raman Arora
- 2018 Oral: Stochastic PCA with $\ell_2$ and $\ell_1$ Regularization
  Poorya Mianjy · Raman Arora
- 2017 Poster: Hyperplane Clustering Via Dual Principal Component Pursuit
  Manolis Tsakiris · Rene Vidal
- 2017 Talk: Hyperplane Clustering Via Dual Principal Component Pursuit
  Manolis Tsakiris · Rene Vidal