

Poster in Workshop: Spurious correlations, Invariance, and Stability (SCIS)

Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty

Thomas George · Guillaume Lajoie · Aristide Baratin


Abstract:

A recent line of work has identified a so-called ‘lazy regime’ where a deep network can be well approximated by its linearization around initialization throughout training. Here we investigate the comparative effect of the lazy (linear) and feature-learning (non-linear) regimes on subgroups of examples based on their difficulty. Specifically, we show that easier examples are given more weight in feature-learning mode, resulting in faster training compared to more difficult ones. We illustrate this phenomenon across different ways to quantify example difficulty, including c-score, label noise, and the presence of spurious correlations.
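For concreteness, the linearization the ‘lazy’ regime refers to is a first-order Taylor expansion of the network around its initial parameters: f_lin(x; θ) = f(x; θ₀) + ∇_θ f(x; θ₀) · (θ − θ₀). Below is a minimal JAX sketch of this (illustrative only, not the authors' code); the `mlp` architecture and parameter shapes are hypothetical placeholders.

```python
import jax
import jax.numpy as jnp

# A tiny MLP for illustration: params is a list of (W, b) pairs.
def mlp(params, x):
    for W, b in params[:-1]:
        x = jnp.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params0 = [  # initialization theta_0
    (jax.random.normal(k1, (2, 32)) / jnp.sqrt(2.0), jnp.zeros(32)),
    (jax.random.normal(k2, (32, 1)) / jnp.sqrt(32.0), jnp.zeros(1)),
]
x = jax.random.normal(k3, (4, 2))

# Linearize the network in its parameters around params0:
# f_lin(theta) = f(theta_0) + J(theta_0) . (theta - theta_0).
f0, jvp_fn = jax.linearize(lambda p: mlp(p, x), params0)

# Evaluate the linearized model at slightly perturbed parameters.
delta = jax.tree_util.tree_map(lambda p: 0.01 * jnp.ones_like(p), params0)
params = jax.tree_util.tree_map(lambda p, d: p + d, params0, delta)
f_lin = f0 + jvp_fn(delta)

# In the lazy regime this gap stays small throughout training;
# here it is small simply because the perturbation is small.
print(jnp.abs(f_lin - mlp(params, x)).max())
```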
