

Poster in Workshop: High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning

Interpolated-MLPs: Controllable Inductive Bias

Sean Wu · Jordan Hong · Keyu Bai · Gregor Bachmann


Abstract:

Due to their weak inductive bias, Multi-Layer Perceptrons (MLPs) perform worse at low compute levels than standard architectures such as convolutional neural networks (CNNs). Recent work, however, has shown that this performance gap shrinks drastically as compute is increased, without changing the amount of inductive bias. In this work, we investigate whether the converse is true: can increasing the inductive bias of MLPs also improve performance at small levels of compute? To address this question, we propose a "Soft MLP" approach which we coin Interpolated MLP (I-MLP). We increase the level of inductive bias of the standard MLP by introducing a novel algorithm based on interpolation between the MLP's weights and fixed weights from a prior model with high inductive bias. We showcase our method using various prior models, including CNNs and the MLP-Mixer architecture. We find that even without changing the MLP architecture, we can surpass standard MLP performance at low-compute scales by only changing the training process to add inductive bias.
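To make the interpolation idea concrete, here is a minimal sketch of one plausible realization: a linear layer whose effective weights are a convex combination of its own trainable weights and fixed weights taken from a prior model (e.g. a convolution unrolled into a dense matrix). The coefficient `alpha`, the class name `InterpolatedLinear`, and how the prior weights are obtained are assumptions for illustration, not details confirmed by the abstract.

```python
# Hypothetical sketch of weight interpolation between a trainable MLP layer
# and fixed weights from a high-inductive-bias prior model (assumption: the
# interpolation is a simple convex combination controlled by alpha).
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterpolatedLinear(nn.Module):
    def __init__(self, prior_weight: torch.Tensor, alpha: float = 0.5):
        super().__init__()
        out_features, in_features = prior_weight.shape
        # Trainable MLP weights.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Fixed weights from the prior model (not updated during training).
        self.register_buffer("prior_weight", prior_weight)
        self.alpha = alpha  # 0 -> plain MLP, 1 -> frozen prior

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective weights are interpolated between learned and prior weights.
        w = (1.0 - self.alpha) * self.weight + self.alpha * self.prior_weight
        return F.linear(x, w, self.bias)
```

Under this reading, alpha = 0 recovers a standard MLP layer, while larger alpha pulls the layer toward the prior's structure, giving a controllable level of inductive bias.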
