ICML Three Mechanisms of Feature Learning in an Analytically Solvable Model

Poster in Workshop: High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning

Three Mechanisms of Feature Learning in an Analytically Solvable Model

Yizhou Xu · Liu Ziyin

Abstract:

We identify and exactly solve the learning dynamics of a one-hidden-layer linear model at any finite width whose limits exhibit both the kernel phase and the feature learning phase. We analyze the phase diagram of this model in different limits of common hyperparameters including width, layer-wise learning rates, scale of output, and scale of initialization. Our solution identifies three novel prototype mechanisms of feature learning: (1) learning by alignment, (2) learning by disalignment, and (3) learning by rescaling. In sharp contrast, none of these mechanisms is present in the kernel regime of the model. We empirically demonstrate that these discoveries also appear in deep nonlinear networks in real tasks.
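
As a rough illustration of the kernel versus feature-learning contrast that the abstract refers to, the sketch below trains a one-hidden-layer linear model f(x) = alpha * u . (W x) by gradient descent on a single data point and reports how far the parameters and the tangent kernel move. Everything specific here is an assumption of this sketch and not the authors' setup or solution: the parameterization, the mirrored initialization, the step size eta / alpha**2, the single training point, and the names `train`, `alpha`, `eta`, and `half_width`. It shows only the generic effect of the output scale and does not reproduce the three mechanisms.

```python
import math
import random


def train(alpha, eta=0.05, steps=2000, half_width=4, seed=0):
    """Fit alpha * u . (W x) to one target by plain gradient descent.

    Hypothetical setup for illustration only: squared loss, one input,
    step size eta / alpha**2 so the output moves at a similar speed for
    every alpha.
    """
    rng = random.Random(seed)
    x = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]  # fixed input with unit norm
    y = 1.0                                 # single regression target
    dim = len(x)
    x_sq = sum(xj * xj for xj in x)

    # Mirrored initialization (an assumption): every hidden unit appears
    # twice with opposite output weights, so the initial output is zero
    # up to rounding.
    a = [rng.gauss(0.0, 1.0 / math.sqrt(2 * half_width))
         for _ in range(half_width)]
    B = [[rng.gauss(0.0, 1.0 / math.sqrt(dim)) for _ in range(dim)]
         for _ in range(half_width)]
    u = a + [-v for v in a]
    W = [row[:] for row in B] + [row[:] for row in B]
    u0, W0 = u[:], [row[:] for row in W]

    def hidden(W):
        return [sum(w * xj for w, xj in zip(row, x)) for row in W]

    def output(u, W):
        return alpha * sum(ui * hi for ui, hi in zip(u, hidden(W)))

    def kernel(u, W):
        # Tangent kernel of u . (W x) at the training input:
        # |W x|^2 + |u|^2 |x|^2.
        return sum(h * h for h in hidden(W)) + sum(ui * ui for ui in u) * x_sq

    lr = eta / alpha ** 2
    for _ in range(steps):
        h = hidden(W)
        r = output(u, W) - y  # residual of the squared loss
        new_u = [ui - lr * alpha * r * hi for ui, hi in zip(u, h)]
        # The update of W uses the output weights from before this step.
        W = [[w - lr * alpha * r * ui * xj for w, xj in zip(row, x)]
             for row, ui in zip(W, u)]
        u = new_u

    moved = math.sqrt(
        sum((p - q) ** 2 for p, q in zip(u, u0))
        + sum((p - q) ** 2
              for row, row0 in zip(W, W0) for p, q in zip(row, row0))
    )
    norm0 = math.sqrt(
        sum(q * q for q in u0) + sum(q * q for row0 in W0 for q in row0)
    )
    k0, k_end = kernel(u0, W0), kernel(u, W)
    return {
        "loss": 0.5 * (output(u, W) - y) ** 2,
        "param_change": moved / norm0,
        "kernel_change": abs(k_end - k0) / k0,
    }


if __name__ == "__main__":
    for alpha in (1.0, 100.0):
        print(alpha, train(alpha))
```

In this toy setting, a large output scale such as alpha = 100 should fit the target while leaving the parameters and the tangent kernel almost unchanged, which is the kernel-like behavior. With alpha = 1 the same fit requires a visible change in both, a crude stand-in for feature learning.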
