Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Abstract:
Deep learning models are often successfully trained using gradient descent, despite the worst-case hardness of the underlying non-convex optimization problem. A key question is thus under what conditions one can prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer, a convolutional structure with no overlap, and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.
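As a concrete illustration of the setting, the architecture studied here applies a single filter w to non-overlapping patches of the input, passes each response through a ReLU, and averages. A minimal numpy sketch of gradient descent on the squared loss with Gaussian inputs and a planted teacher filter follows; the patch count, dimensions, learning rate, and initialization scale are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def predict(w, X, k):
    """No-overlap ConvNet: f(x; w) = (1/k) * sum_i relu(w . x_i),
    where x_1..x_k are disjoint patches of the input x."""
    n, d = X.shape
    patches = X.reshape(n, k, d // k)
    return relu(patches @ w).mean(axis=1)

def loss(w, X, y, k):
    r = predict(w, X, k) - y
    return 0.5 * float(np.mean(r ** 2))

def gradient(w, X, y, k):
    """Gradient of the empirical squared loss with respect to the filter w."""
    n, d = X.shape
    patches = X.reshape(n, k, d // k)          # (n, k, m)
    pre = patches @ w                           # (n, k) pre-activations
    resid = relu(pre).mean(axis=1) - y          # (n,) prediction errors
    # Chain rule: the ReLU derivative is an indicator on the pre-activation sign;
    # averaging over axes (0, 1) supplies both the 1/n and 1/k factors.
    g = ((pre > 0) * resid[:, None])[:, :, None] * patches
    return g.mean(axis=(0, 1))

# Illustrative sizes: k = 4 disjoint patches, each of dimension m = 5.
k, m = 4, 5
w_star = rng.normal(size=m)                     # planted teacher filter
X = rng.normal(size=(20000, k * m))             # i.i.d. Gaussian inputs
y = predict(w_star, X, k)                       # noiseless teacher labels

w = 0.1 * rng.normal(size=m)                    # small random initialization
init_loss = loss(w, X, y, k)
for _ in range(2000):
    w = w - 0.5 * gradient(w, X, y, k)
final_loss = loss(w, X, y, k)
```

With a benign learning rate and random initialization, the final loss drops well below its initial value, consistent with the paper's claim that gradient descent reaches the global optimum in this regime (the paper's analysis is for the population loss; the sketch above uses a large finite sample as a stand-in).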
Author Information
Alon Brutzkus (Tel Aviv University)
Amir Globerson (Tel Aviv University)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Poster: Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
  Mon. Aug 7th, 08:30 AM -- 12:00 PM, Room Gallery #92
More from the Same Authors
- 2022 Poster: Efficient Learning of CNNs using Patch Based Features
  Alon Brutzkus · Amir Globerson · Eran Malach · Alon Regev Netser · Shai Shalev-Shwartz
- 2022 Spotlight: Efficient Learning of CNNs using Patch Based Features
  Alon Brutzkus · Amir Globerson · Eran Malach · Alon Regev Netser · Shai Shalev-Shwartz
- 2021 Poster: On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
  Shahar Azulay · Edward Moroshko · Mor Shpigel Nacson · Blake Woodworth · Nati Srebro · Amir Globerson · Daniel Soudry
- 2021 Oral: On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
  Shahar Azulay · Edward Moroshko · Mor Shpigel Nacson · Blake Woodworth · Nati Srebro · Amir Globerson · Daniel Soudry
- 2021 Poster: Compositional Video Synthesis with Action Graphs
  Amir Bar · Roi Herzig · Xiaolong Wang · Anna Rohrbach · Gal Chechik · Trevor Darrell · Amir Globerson
- 2021 Spotlight: Compositional Video Synthesis with Action Graphs
  Amir Bar · Roi Herzig · Xiaolong Wang · Anna Rohrbach · Gal Chechik · Trevor Darrell · Amir Globerson
- 2021 Poster: Towards Understanding Learning in Neural Networks with Linear Teachers
  Roei Sarussi · Alon Brutzkus · Amir Globerson
- 2021 Spotlight: Towards Understanding Learning in Neural Networks with Linear Teachers
  Roei Sarussi · Alon Brutzkus · Amir Globerson
- 2019 Poster: Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem
  Alon Brutzkus · Amir Globerson
- 2019 Oral: Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem
  Alon Brutzkus · Amir Globerson
- 2019 Poster: Low Latency Privacy Preserving Inference
  Alon Brutzkus · Ran Gilad-Bachrach · Oren Elisha
- 2019 Oral: Low Latency Privacy Preserving Inference
  Alon Brutzkus · Ran Gilad-Bachrach · Oren Elisha
- 2018 Poster: Learning to Optimize Combinatorial Functions
  Nir Rosenfeld · Eric Balkanski · Amir Globerson · Yaron Singer
- 2018 Poster: Predict and Constrain: Modeling Cardinality in Deep Structured Prediction
  Nataly Brukhim · Amir Globerson
- 2018 Oral: Learning to Optimize Combinatorial Functions
  Nir Rosenfeld · Eric Balkanski · Amir Globerson · Yaron Singer
- 2018 Oral: Predict and Constrain: Modeling Cardinality in Deep Structured Prediction
  Nataly Brukhim · Amir Globerson
- 2017 Poster: Learning Infinite Layer Networks without the Kernel Trick
  Roi Livni · Daniel Carmon · Amir Globerson
- 2017 Talk: Learning Infinite Layer Networks without the Kernel Trick
  Roi Livni · Daniel Carmon · Amir Globerson