Poster in Workshop: ES-FoMo II: 2nd Workshop on Efficient Systems for Foundation Models
Projectable Models: One-Shot Generation of Small Specialized Transformers from Large Ones
Andrey Zhmoginov · Jihwan Lee · Mark Sandler
Modern Foundation Models (FMs) are typically trained on corpora spanning a wide range of data modalities, topics, and downstream tasks. Utilizing these models can be very computationally expensive and is out of reach for most consumer devices. Furthermore, much of an FM's broad knowledge may be irrelevant to the specific task at hand. Here we explore a technique for mapping the parameters of a large Transformer to the parameters of a smaller specialized model. By making this transformation task-specific, we aim to capture, in the smaller model, only the narrower scope of knowledge needed to perform a given task. We study our method on image modeling tasks and show that the performance of the generated models exceeds that of universal conditional models.
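To make the core idea concrete, below is a minimal, hypothetical sketch of a task-conditioned mapping from large-model weights to small-model weights. All names (WeightProjector, task_embedding, the dimensions) are illustrative assumptions, not the authors' actual method or API.

```python
# Hypothetical sketch: a task-conditioned projector that maps (flattened)
# large-Transformer weights to the weights of a smaller specialized model
# in one shot. Not the paper's implementation; an illustration only.
import torch
import torch.nn as nn


class WeightProjector(nn.Module):
    """Maps flattened large-model weights, conditioned on a task embedding,
    to the weights of a smaller specialized model."""

    def __init__(self, large_dim: int, small_dim: int, task_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(large_dim + task_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, small_dim),
        )

    def forward(self, large_weights: torch.Tensor, task_embedding: torch.Tensor) -> torch.Tensor:
        # Concatenate the large-model weights with the task embedding and
        # emit the corresponding small-model weights in a single forward pass.
        return self.net(torch.cat([large_weights, task_embedding], dim=-1))


# Toy usage: project a stand-in "large" weight vector into a "small" one for a task.
projector = WeightProjector(large_dim=4096, small_dim=512, task_dim=32)
large_w = torch.randn(4096)           # stand-in for flattened large-Transformer weights
task_z = torch.randn(32)              # stand-in for a task-specific embedding
small_w = projector(large_w, task_z)  # parameters of the generated small model
print(small_w.shape)                  # torch.Size([512])
```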