

Poster in Workshop: High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning

Why Pruning and Conditional Computation Work: A High-Dimensional Perspective

Erdem Koyuncu


Abstract:

We analyze the processes of pruning and conditional computation for a single neuron in the asymptotic learning regime of large input dimension and training set size. For this purpose, we introduce conditional neurons, which implement an early-exit strategy at the neuron level. Specifically, a conditional neuron first considers the local field induced by a subset of its inputs. If this sub-local field is strong enough, the remaining inputs are ignored, saving computation. Conditional neurons thus provide an archetype of the well-known early-exit and conditional-computation architectures. We formally analyze their generalization performance to understand why conditional computation is so effective at preserving performance despite a significantly reduced average amount of computation. In the process, we establish a concentration theorem for one-shot neuron-wise pruning, a technique that has recently been popularized in the context of large language models.
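As a rough illustration of the mechanism described in the abstract, the following Python sketch implements a single conditional neuron with a neuron-level early exit. The names `subset_size` and `threshold`, as well as the choice of a sign activation, are assumptions made for illustration; they are not details taken from the paper.

```python
import numpy as np

def conditional_neuron(x, w, subset_size, threshold):
    """Minimal sketch of a conditional neuron (assumed sign activation).

    The neuron first computes the local field induced by its first
    `subset_size` inputs. If the magnitude of this sub-local field
    exceeds `threshold`, the remaining inputs are ignored and the
    output is based on the sub-local field alone, saving computation.
    Otherwise, the full local field over all inputs is computed.
    """
    # Sub-local field induced by a subset of the inputs.
    sub_field = np.dot(w[:subset_size], x[:subset_size])
    if abs(sub_field) >= threshold:
        # Early exit: the remaining multiply-accumulates are skipped.
        return np.sign(sub_field)
    # Sub-local field too weak to decide: use the full local field.
    full_field = sub_field + np.dot(w[subset_size:], x[subset_size:])
    return np.sign(full_field)
```

Under this sketch, the average computation per input is governed by how often the early exit fires, which in turn depends on the threshold and on the statistics of the sub-local field; the abstract's analysis concerns exactly this trade-off in the high-dimensional regime.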
