Poster in High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning
Analysing feature learning of gradient descent using periodic functions
Jaehui Hwang · Taeyoung Kim · Hongseok Yang
Abstract:
We present an analysis of feature learning in neural networks when the target functions are periodic functions applied to one-dimensional projections of the input. Damian et al. (2022) previously considered a similar question for target functions of the form $f^*(x) = p^*(\langle u_1,x\rangle,\ldots,\langle u_r,x\rangle)$ for some vectors $u_1,\ldots,u_r \in \mathbb{R}^d$ and a polynomial $p^*$, and proved that feature learning occurs during the training of a shallow neural network, even when the first-layer weights of the network are updated only once during training. Here, feature learning refers to a subset of the first-layer weights $w_1,\ldots,w_m \in \mathbb{R}^d$ of the trained network pointing in the same directions as $\{u_1,\ldots,u_r\}$. We show that for periodic target functions, the same single gradient-based update of the first-layer weights induces feature learning in a shallow neural network, despite the additional challenge that feature learning for periodic functions involves both the directions and the magnitudes of $\{u_1,\ldots,u_r\}$: a useful feature for, say, $f^*(x) = \sin(\langle u,x\rangle)$ is a vector $w \in \mathbb{R}^d$ such that $\angle(w, u) \approx 0$ and $\|w\| \approx \|u\|$. Our theoretical result also bounds the sample complexity for learning a periodic target function. Experimental results further support our theoretical finding and illustrate the benefits of feature learning for a broader class of periodic target functions.
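As a rough illustration of the setting above, the following NumPy sketch (not the authors' code) draws data from a single-index periodic target $y = \sin(\langle u, x\rangle)$, applies one large gradient step to the first-layer weights of a shallow ReLU network with the second layer frozen, and then checks how closely the most aligned neurons match the direction and magnitude of $u$. The ReLU activation, the sizes $d$, $m$, $n$, the small initialisation, and the step size are all illustrative assumptions, not choices taken from the paper.

```python
# A minimal sketch (not the authors' code) of one-step feature learning for a
# periodic single-index target.  The architecture (ReLU), the sizes d, m, n,
# the small initialisation, and the step size eta are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 25, 256, 20_000           # input dimension, width, sample size
eta = 15.0 * m                      # large one-step learning rate (hand-picked)

# Hidden direction u; the target depends on x only through <u, x>, and both
# the direction and the norm of u matter for a useful feature.
u = rng.standard_normal(d)
u *= 2.0 / np.linalg.norm(u)        # set ||u|| = 2

X = rng.standard_normal((n, d))     # Gaussian inputs
y = np.sin(X @ u)                   # periodic single-index target

# Shallow network f(x) = sum_j a_j * relu(<w_j, x>), second layer frozen.
W = 0.1 * rng.standard_normal((m, d)) / np.sqrt(d)   # small first-layer init
a = rng.choice([-1.0, 1.0], size=m) / m              # fixed second-layer weights

pre = X @ W.T                        # (n, m) pre-activations
pred = np.maximum(pre, 0.0) @ a      # network output at initialisation

# Gradient of (1/2) * mean((pred - y)^2) with respect to each weight w_j.
resid = pred - y                                          # (n,)
grad_W = ((resid[:, None] * (pre > 0.0)) * a).T @ X / n   # (m, d)

W1 = W - eta * grad_W                # the single gradient-based update

# Feature-learning check: angle to u and norm of the most aligned neurons.
cos = (W1 @ u) / (np.linalg.norm(W1, axis=1) * np.linalg.norm(u))
top = np.argsort(-np.abs(cos))[:5]
print("|cos angle(w_j, u)| of top neurons:", np.round(np.abs(cos[top]), 3))
print("||w_j|| of top neurons            :", np.round(np.linalg.norm(W1[top], axis=1), 2))
print("||u||                             :", np.round(np.linalg.norm(u), 2))
```

With these (arbitrary) choices, the one-step update of each neuron is dominated by a component along $\pm u$, so the most aligned neurons typically show $|\cos|$ far above the $\approx 1/\sqrt{d}$ level expected of random directions, and their norms are on the order of $\|u\|$ only because the step size was hand-picked to make them so. The sketch is meant solely to make concrete what feature learning in both direction and magnitude means, not to reproduce the paper's construction or its guarantees.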