

Poster
in
Workshop: HiLD: High-dimensional Learning Dynamics Workshop

High-dimensional Learning Dynamics of Deep Neural Nets in the Neural Tangent Regime

Yongqi Du · Zenan Ling · Robert Qiu · Zhenyu Liao


Abstract: In this paper, building upon recent advances in the high-dimensional characterization of the neural tangent kernel (NTK) via random matrix techniques in \cite{gu2022Lossless}, we derive \emph{precise} high-dimensional training dynamics for deep and wide neural networks (DNNs). Assuming a Gaussian mixture model for the input features, we derive an \emph{exact} expression for the high-dimensional training mean squared error (MSE) as a function of the feature dimension $p$, the sample size $n$, the statistics of the input features, the number of layers $L$, and the nonlinear activation function in each layer of the network. These theoretical results provide novel insight into the inner mechanisms of DNNs, and in particular into the interplay between activation function, network depth, and feature statistics.
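As a minimal illustration of the setting described above, the sketch below simulates Gaussian mixture inputs and tracks the training MSE under standard linearized (NTK) gradient-flow dynamics, where the residual decays along the kernel's eigendirections. The mixture parameters and the simple inner-product Gram matrix are illustrative assumptions; the paper's exact deep-NTK expression from random matrix theory is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: n samples in dimension p from a two-component Gaussian
# mixture (an assumed instance of the paper's data model).
n, p = 200, 100
means = np.stack([np.ones(p) / np.sqrt(p), -np.ones(p) / np.sqrt(p)])
labels = rng.integers(0, 2, size=n)
X = means[labels] + rng.standard_normal((n, p)) / np.sqrt(p)
y = 2.0 * labels - 1.0  # +/-1 regression targets

# Placeholder kernel: a linear Gram matrix with a small ridge for
# positive definiteness. Any PSD kernel shows the same spectral decay.
K = X @ X.T + 1e-3 * np.eye(n)

# Under gradient flow on the MSE with rate eta, the training residual
# evolves as r(t) = exp(-eta * K * t) y (standard NTK-regime result).
eigvals, eigvecs = np.linalg.eigh(K)
coeffs = eigvecs.T @ y  # residual coordinates in the eigenbasis

def train_mse(t, eta=1.0):
    """Training MSE at time t under linearized (NTK) dynamics."""
    residual = np.exp(-eta * eigvals * t) * coeffs
    return float(residual @ residual) / n

for t in [0.0, 1.0, 10.0, 100.0]:
    print(f"t={t:6.1f}  train MSE = {train_mse(t):.4f}")
```

At t = 0 the training MSE equals the mean squared target norm (here 1.0), and it decreases monotonically as each eigendirection's residual decays at a rate set by the corresponding NTK eigenvalue.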
