Poster in Workshop: HiLD: High-dimensional Learning Dynamics Workshop
High-dimensional Learning Dynamics of Deep Neural Nets in the Neural Tangent Regime
Yongqi Du · Zenan Ling · Robert Qiu · Zhenyu Liao
Abstract:
In this paper, building upon recent advances in the high-dimensional characterization of the neural tangent kernel (NTK) via random matrix techniques in \cite{gu2022Lossless}, we derive \emph{precise} high-dimensional training dynamics for deep and wide neural networks (DNNs). Assuming a Gaussian mixture model for the input features, we derive an \emph{exact} expression for the high-dimensional training mean squared error (MSE) as a function of the feature dimension $p$, the sample size $n$, the statistics of the input features, the number of layers $L$, and the nonlinear activation function in each layer of the network. These theoretical results provide novel insights into the inner mechanisms of DNNs, and in particular into the interplay between the activation function, network depth, and feature statistics.
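The setting described above can be illustrated numerically. The sketch below is not the paper's derivation; it is a minimal simulation, under assumed illustrative parameters (a two-class Gaussian mixture, the standard closed-form NTK of an infinite-width two-layer ReLU network, and gradient-flow training from zero initialization), of how the training MSE evolves as each NTK eigenmode of the residual decays exponentially:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative Gaussian mixture inputs: two classes with opposite means
n, p = 100, 50
mu = np.zeros(p)
mu[0] = 2.0
X = np.concatenate([rng.normal(mu, 1.0, (n // 2, p)),
                    rng.normal(-mu, 1.0, (n // 2, p))]) / np.sqrt(p)
y = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])

def ntk_relu(X):
    """NTK of an infinite-width two-layer ReLU network (arc-cosine kernel form):
    K = Sigma_1 + (x . x') * Sigma_0', with Gaussian first-layer weights."""
    G = X @ X.T
    norms = np.sqrt(np.diag(G))
    cos = np.clip(G / np.outer(norms, norms), -1.0, 1.0)
    theta = np.arccos(cos)
    sigma1 = np.outer(norms, norms) * (np.sin(theta) + (np.pi - theta) * cos) / (2 * np.pi)
    sigma0_dot = (np.pi - theta) / (2 * np.pi)
    return sigma1 + G * sigma0_dot

K = ntk_relu(X)

# Under gradient flow from f_0 = 0, the residual along each NTK eigenmode
# decays as exp(-t * lambda_i / n), so the training MSE is a sum of
# exponentially decaying terms.
lam, U = np.linalg.eigh(K)
c = U.T @ y
for t in [0.0, 10.0, 100.0, 1000.0]:
    mse = np.mean((np.exp(-t * lam / n) * c) ** 2)
    print(f"t = {t:7.1f}  train MSE = {mse:.4f}")
```

At $t = 0$ the training MSE equals the mean squared label, and it decreases monotonically as larger NTK eigenvalues absorb their share of the residual first; the exact high-dimensional limit of such curves is what the paper characterizes.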