Poster in Workshop: 1st ICML Workshop on In-Context Learning (ICL @ ICML 2024)
Learning Fast and Slow: Representations for In-Context Weight Modulation
Andrey Zhmoginov · Jihwan Lee · Max Vladymyrov · Mark Sandler
Most natural sequential processes involve a spectrum of time scales: from fast-changing variations responsible for local structure to slowly-changing dynamics, akin to memory, that capture context information. Here we propose a method for learning such a disentangled slow-fast representation in the activations of a conventional Transformer model. We accomplish this by employing regularization techniques inspired by contrastive learning. The proposed approach can be further analyzed by adopting a Gaussian process prior, which results in a Variational Autoencoder interpretation of the Transformer model. We evaluate our techniques on synthetic in-context learning tasks and widely-used text benchmarks, where we show the emergence of disentangled representations. We then propose a HyperNetwork-inspired approach, in which the slow representations are used to modulate the Transformer weights applied to the fast, short-range activations. We demonstrate that adding such modulation makes it possible to generate models specialized to a particular context on the fly.
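To make the abstract's two main ingredients concrete, below is a minimal, hedged sketch (not the authors' code) of (a) a HyperNetwork-style layer in which a slowly varying context vector modulates the weights applied to fast, short-range activations, and (b) a contrastive-style consistency regularizer that encourages slow codes from segments of the same sequence to agree. All module names, dimensions, and the specific loss form are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: hypothetical names and dimensions,
# not the method described in the paper.
import torch
import torch.nn as nn


class SlowFastModulatedLayer(nn.Module):
    """Linear layer whose output is modulated by a slow context vector."""

    def __init__(self, d_fast: int, d_slow: int):
        super().__init__()
        self.base = nn.Linear(d_fast, d_fast)   # shared weights on fast activations
        self.hyper = nn.Linear(d_slow, d_fast)  # maps slow code -> per-channel gains

    def forward(self, fast: torch.Tensor, slow: torch.Tensor) -> torch.Tensor:
        # fast: (batch, seq, d_fast) short-range token activations
        # slow: (batch, d_slow) slowly varying context summary (e.g., pooled)
        gain = 1.0 + self.hyper(slow).unsqueeze(1)  # (batch, 1, d_fast)
        return gain * self.base(fast)               # context-specialized output


def slow_consistency_loss(slow_a: torch.Tensor, slow_b: torch.Tensor) -> torch.Tensor:
    """Contrastive-style regularizer: slow codes from two segments of the
    same sequence form positive pairs; other sequences in the batch act
    as negatives via the softmax denominator."""
    sims = nn.functional.cosine_similarity(
        slow_a.unsqueeze(1), slow_b.unsqueeze(0), dim=-1
    )                                        # (batch, batch) similarity matrix
    targets = torch.arange(slow_a.size(0))   # diagonal entries are positives
    return nn.functional.cross_entropy(sims / 0.1, targets)


if __name__ == "__main__":
    layer = SlowFastModulatedLayer(d_fast=64, d_slow=16)
    fast = torch.randn(8, 32, 64)
    slow = torch.randn(8, 16)
    out = layer(fast, slow)  # (8, 32, 64), weights modulated per context
    loss = slow_consistency_loss(slow, slow + 0.01 * torch.randn_like(slow))
    print(out.shape, loss.item())
```

The multiplicative gain here stands in for the general idea of generating context-specialized weights on the fly; the paper's actual modulation and regularization may differ.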