

On the Spectral Bias of Neural Networks

Nasim Rahaman · Aristide Baratin · Devansh Arpit · Felix Draxler · Min Lin · Fred Hamprecht · Yoshua Bengio · Aaron Courville

Pacific Ballroom #72

Keywords: [ Statistical Learning Theory ] [ Optimization ] [ Deep Learning Theory ] [ Computational Learning Theory ]


Neural networks are known to be a class of highly expressive functions, able to fit even random input-output mappings with 100% accuracy. In this work, we present properties of neural networks that complement this aspect of expressivity. Using tools from Fourier analysis, we highlight a learning bias of deep networks towards low-frequency functions (i.e., functions that vary globally without local fluctuations), which manifests itself as a frequency-dependent learning speed. Intuitively, this property is in line with the observation that over-parameterized networks prioritize learning simple patterns that generalize across data samples. We also investigate the role of the shape of the data manifold by presenting empirical and theoretical evidence that, somewhat counter-intuitively, learning higher frequencies gets easier with increasing manifold complexity.
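The frequency-dependent learning speed described in the abstract can be probed with a small experiment. The following is a minimal sketch, not the authors' setup: it assumes a two-layer ReLU network trained by plain full-batch gradient descent on a sum of unit-amplitude sinusoids, and it records the Fourier amplitude of the network output at each target frequency during training. The function names `fourier_amplitudes` and `track_spectrum`, and all hyperparameters, are hypothetical choices made for illustration.

```python
import numpy as np


def fourier_amplitudes(values, freqs):
    """Amplitude at each integer frequency of a signal sampled uniformly on [0, 1)."""
    n = len(values)
    spectrum = np.fft.rfft(values) / n
    # A real sinusoid of amplitude A contributes A/2 to its rfft bin, hence the factor 2.
    return 2.0 * np.abs(spectrum[np.asarray(freqs)])


def track_spectrum(freqs=(1, 5, 10), steps=2000, width=64, lr=0.005,
                   log_every=100, seed=0):
    """Train a small ReLU net on a sum of sinusoids and log per-frequency amplitudes.

    Assumed setup (illustrative only): two layers, mean squared error,
    full-batch gradient descent. Returns an array of shape
    (number of logged steps, len(freqs)).
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, 200, endpoint=False)[:, None]   # (200, 1)
    y = sum(np.sin(2 * np.pi * k * x) for k in freqs)         # unit amplitude per frequency

    w1 = rng.normal(0.0, 1.0, (1, width))
    b1 = rng.normal(0.0, 1.0, width)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(width), (width, 1))
    b2 = np.zeros(1)

    history = []
    for step in range(steps):
        # Forward pass.
        h_pre = x @ w1 + b1
        h = np.maximum(h_pre, 0.0)
        pred = h @ w2 + b2

        if step % log_every == 0:
            history.append(fourier_amplitudes(pred[:, 0], freqs))

        # Backward pass for the loss 0.5 * mean((pred - y)^2).
        g_pred = (pred - y) / len(x)
        g_w2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_h = (g_pred @ w2.T) * (h_pre > 0)
        g_w1 = x.T @ g_h
        g_b1 = g_h.sum(axis=0)

        # Gradient descent update.
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2

    return np.array(history)


if __name__ == "__main__":
    # Each row is one logged step; each column is one target frequency.
    print(track_spectrum())
```

Because every target component has amplitude 1, a column that approaches 1 indicates that the corresponding frequency has been fitted. Comparing how quickly the columns rise is one way to look for the bias described in the abstract, though the result of such a toy run depends on the assumed architecture and hyperparameters.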
