Poster in Workshop: High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning
Analysing feature learning of gradient descent using periodic functions
Jaehui Hwang · Taeyoung Kim · Hongseok Yang
Abstract:
We present an analysis of feature learning in neural networks when the target function is a periodic function applied to a one-dimensional projection of the input. Previously, Damian et al. (2022) considered a similar question for target functions of the form $f^*(x) = p^*(\langle u_1, x\rangle, \ldots, \langle u_r, x\rangle)$ for some vectors $u_1, \ldots, u_r \in \mathbb{R}^d$ and a polynomial $p^*$, and proved that feature learning occurs during the training of a shallow neural network, even when the first-layer weights of the network are updated only once during training. Here, feature learning refers to a subset of the first-layer weights $w_1, \ldots, w_m \in \mathbb{R}^d$ of the trained network pointing in the same directions as $\{u_1, \ldots, u_r\}$. We show that for periodic target functions, the same single gradient-based update of the first-layer weights induces feature learning in a shallow neural network, despite the additional challenge that feature learning for periodic functions now involves both the directions and the magnitudes of $\{u_1, \ldots, u_r\}$: a useful feature for, say, $f^*(x) = \sin(\langle u, x\rangle)$ is a vector $w \in \mathbb{R}^d$ such that $\angle(w, u) \approx 0$ and $\|w\| \approx \|u\|$. Our theoretical result characterises the sample complexity of learning a periodic target function in this setting. Experimental results further support our theoretical finding and illustrate the benefits of feature learning for a broader class of periodic target functions.
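To make the setting concrete, the following is a minimal sketch, not the authors' exact procedure, of the single-gradient-step experiment described in the abstract: a shallow network is trained on $y = \sin(\langle u, x\rangle)$ with exactly one gradient update of its first-layer weights, after which we check whether some row of the weight matrix matches $u$ in both direction and norm. The dimensions, learning rate, ReLU activation, and squared loss are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, n, lr = 20, 512, 10_000, 2.0   # assumed hyperparameters, for illustration

u = rng.normal(size=d)
u *= 2.0 / np.linalg.norm(u)         # hidden feature with a fixed norm ||u|| = 2

X = rng.normal(size=(n, d))          # Gaussian inputs
y = np.sin(X @ u)                    # periodic target f*(x) = sin(<u, x>)

W = rng.normal(size=(m, d)) / np.sqrt(d)   # first-layer weights w_1, ..., w_m
a = rng.choice([-1.0, 1.0], size=m) / m    # frozen second-layer weights

def forward(X, W):
    """Shallow ReLU network a^T relu(W x)."""
    return np.maximum(X @ W.T, 0.0) @ a

# One gradient step on W for the squared loss, with the second layer frozen.
resid = forward(X, W) - y                         # (n,)
act = (X @ W.T > 0.0).astype(float)               # ReLU derivative, (n, m)
grad_W = ((resid[:, None] * act) * a).T @ X / n   # (m, d)
W -= lr * grad_W

# Feature-learning diagnostic: after one step, does some w_j satisfy
# angle(w_j, u) ~ 0 and ||w_j|| ~ ||u||?
cos = (W @ u) / (np.linalg.norm(W, axis=1) * np.linalg.norm(u))
j = np.argmax(cos)
print(f"best alignment: cos angle = {cos[j]:.3f}, "
      f"||w_j|| = {np.linalg.norm(W[j]):.3f}, ||u|| = {np.linalg.norm(u):.3f}")
```

The printed diagnostic reflects the paper's notion of feature learning for periodic targets: unlike the polynomial case, alignment in direction alone is not enough, so the norm of the best-aligned row is reported alongside the angle.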