Timezone: »
Poster
Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis
Alexander Munteanu · Simon Omlor · Zhao Song · David Woodruff
A common method in training neural networks is to initialize all the weights to be independent Gaussian vectors. We observe that by instead initializing the weights into independent pairs, where each pair consists of two identical Gaussian vectors, we can significantly improve the convergence analysis. While a similar technique has been studied for random inputs [Daniely, NeurIPS 2020], it has not been analyzed with arbitrary inputs. Using this technique, we show how to significantly reduce the number of neurons required for two-layer ReLU networks, both in the under-parameterized setting with logistic loss, from roughly $\gamma^{-8}$ [Ji and Telgarsky, ICLR 2020] to $\gamma^{-2}$, where $\gamma$ denotes the separation margin with a Neural Tangent Kernel, as well as in the over-parameterized setting with squared loss, from roughly $n^4$ [Song and Yang, 2019] to $n^2$, implicitly also improving the recent running time bound of [Brand, Peng, Song and Weinstein, ITCS 2021]. For the under-parameterized setting we also prove new lower bounds that improve upon prior work, and that under certain assumptions, are best possible.
Author Information
Alexander Munteanu (TU Dortmund)
Simon Omlor (TU Dortmund)
Zhao Song (Adobe Research)
David Woodruff (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Spotlight: Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis »
Wed. Jul 20th 09:00 -- 09:05 PM Room Ballroom 3 & 4
More from the Same Authors
-
2022 : Single, Practical and Fast Dynamic Truncation Kernel Multiplication »
Lianke Qin · Somdeb Sarkhel · Zhao Song · Danyang Zhuo -
2022 Poster: Sketching Algorithms and Lower Bounds for Ridge Regression »
Praneeth Kacham · David Woodruff -
2022 Poster: Learning Augmented Binary Search Trees »
Honghao Lin · Tian Luo · David Woodruff -
2022 Poster: Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time »
David Woodruff · Amir Zandieh -
2022 Spotlight: Sketching Algorithms and Lower Bounds for Ridge Regression »
Praneeth Kacham · David Woodruff -
2022 Spotlight: Learning Augmented Binary Search Trees »
Honghao Lin · Tian Luo · David Woodruff -
2022 Spotlight: Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time »
David Woodruff · Amir Zandieh -
2022 Poster: Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning »
Mayee Chen · Daniel Y Fu · Avanika Narayan · Michael Zhang · Zhao Song · Kayvon Fatahalian · Christopher Re -
2022 Spotlight: Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning »
Mayee Chen · Daniel Y Fu · Avanika Narayan · Michael Zhang · Zhao Song · Kayvon Fatahalian · Christopher Re -
2022 Poster: One-Pass Algorithms for MAP Inference of Nonsymmetric Determinantal Point Processes »
Aravind Reddy · Ryan A. Rossi · Zhao Song · Anup Rao · Tung Mai · Nedim Lipka · Gang Wu · Eunyee Koh · Nesreen K Ahmed -
2022 Spotlight: One-Pass Algorithms for MAP Inference of Nonsymmetric Determinantal Point Processes »
Aravind Reddy · Ryan A. Rossi · Zhao Song · Anup Rao · Tung Mai · Nedim Lipka · Gang Wu · Eunyee Koh · Nesreen K Ahmed -
2021 Poster: Single Pass Entrywise-Transformed Low Rank Approximation »
Yifei Jiang · Yi Li · Yiming Sun · Jiaxin Wang · David Woodruff -
2021 Poster: Fast Sketching of Polynomial Kernels of Polynomial Degree »
Zhao Song · David Woodruff · Zheng Yu · Lichen Zhang -
2021 Poster: Dimensionality Reduction for the Sum-of-Distances Metric »
Zhili Feng · Praneeth Kacham · David Woodruff -
2021 Poster: Streaming and Distributed Algorithms for Robust Column Subset Selection »
Shuli Jiang · Dongyu Li · Irene Mengze Li · Arvind Mahankali · David Woodruff -
2021 Spotlight: Streaming and Distributed Algorithms for Robust Column Subset Selection »
Shuli Jiang · Dongyu Li · Irene Mengze Li · Arvind Mahankali · David Woodruff -
2021 Spotlight: Fast Sketching of Polynomial Kernels of Polynomial Degree »
Zhao Song · David Woodruff · Zheng Yu · Lichen Zhang -
2021 Spotlight: Single Pass Entrywise-Transformed Low Rank Approximation »
Yifei Jiang · Yi Li · Yiming Sun · Jiaxin Wang · David Woodruff -
2021 Oral: Dimensionality Reduction for the Sum-of-Distances Metric »
Zhili Feng · Praneeth Kacham · David Woodruff -
2021 Poster: In-Database Regression in Input Sparsity Time »
Rajesh Jayaram · Alireza Samadian · David Woodruff · Peng Ye -
2021 Poster: Oblivious Sketching for Logistic Regression »
Alexander Munteanu · Simon Omlor · David Woodruff -
2021 Spotlight: Oblivious Sketching for Logistic Regression »
Alexander Munteanu · Simon Omlor · David Woodruff -
2021 Spotlight: In-Database Regression in Input Sparsity Time »
Rajesh Jayaram · Alireza Samadian · David Woodruff · Peng Ye -
2020 Poster: Input-Sparsity Low Rank Approximation in Schatten Norm »
Yi Li · David Woodruff -
2019 Poster: A Framework for Bayesian Optimization in Embedded Subspaces »
Amin Nayebi · Alexander Munteanu · Matthias Poloczek -
2019 Oral: A Framework for Bayesian Optimization in Embedded Subspaces »
Amin Nayebi · Alexander Munteanu · Matthias Poloczek