Spotlight

On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models

Peizhong Ju · Xiaojun Lin · Ness Shroff

Abstract: In this paper, we study the generalization performance of min $\ell_2$-norm overfitting solutions for the neural tangent kernel (NTK) model of a two-layer neural network with ReLU activation that has no bias term. We show that, depending on the ground-truth function, the test error of overfitted NTK models exhibits characteristics that are different from the "double-descent" of other overparameterized linear models with simple Fourier or Gaussian features. Specifically, for a class of learnable functions, we provide a new upper bound of the generalization error that approaches a small limiting value, even when the number of neurons $p$ approaches infinity. This limiting value further decreases with the number of training samples $n$. For functions outside of this class, we provide a lower bound on the generalization error that does not diminish to zero even when $n$ and $p$ are both large.
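
To make the object of study concrete, the following is a minimal sketch of a min $\ell_2$-norm overfitting solution for a bias-free two-layer ReLU NTK model. It is an illustration under stated assumptions, not the authors' code: it assumes only the bottom-layer weights are perturbed, so that the feature for neuron j is the input x gated by the ReLU activation pattern at initialization. The names `ntk_features`, `min_l2_norm_fit`, `V0`, the problem sizes, and the linear ground-truth function are all hypothetical choices made for this example.

```python
import numpy as np

def ntk_features(X, V0):
    """NTK-style features of a bias-free two-layer ReLU network (assumed form).

    X:  (n, d) inputs.
    V0: (p, d) initial bottom-layer weights, one row per neuron.
    Returns an (n, p*d) matrix whose block j for sample i is
    1{x_i . V0[j] > 0} * x_i.
    """
    active = (X @ V0.T > 0).astype(X.dtype)        # (n, p) activation pattern
    feats = active[:, :, None] * X[:, None, :]     # (n, p, d) gated inputs
    return feats.reshape(X.shape[0], -1)           # flatten to (n, p*d)

def min_l2_norm_fit(H, y):
    """Minimum l2-norm solution of H @ w = y via the pseudoinverse."""
    return np.linalg.pinv(H) @ y

rng = np.random.default_rng(0)
n, d, p = 20, 3, 200                               # hypothetical sizes, p*d >> n

# Training inputs normalized to the unit sphere (an assumption of this sketch).
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

V0 = rng.standard_normal((p, d))                   # random initial weights
y = X[:, 0]                                        # hypothetical ground truth f(x) = x_1

H = ntk_features(X, V0)
delta_v = min_l2_norm_fit(H, y)

# In the overparameterized regime the model interpolates the training data.
train_residual = np.max(np.abs(H @ delta_v - y))
```

With `p * d` much larger than `n`, the feature matrix generically has full row rank, so the fitted model overfits (interpolates) the training set. The abstract's results concern how the test error of such a solution behaves as $p$ and $n$ grow.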
