Poster in Workshop: 1st ICML Workshop on In-Context Learning (ICL @ ICML 2024)
Can Mamba In-Context Learn Task Mixtures?
Yingcong Li · Xupeng Wei · Haonan Zhao · Taigao Ma
In-context learning (ICL) refers to the ability to perform new tasks from a prompt sequence of ``in-context'' input-output pairs, without explicit model training. Previous work has shown that State-Space Models (SSMs), particularly Mamba, are competitive with Transformers in ICL. However, whether they can handle mixed tasks in complicated ICL prompts remains an open question. In this work, we study Mamba's performance on mixed ICL tasks with varying degrees of mixture, from low to high and from labeled to unlabeled, and compare it to that of Transformers. We show that Mamba is capable of learning ICL task mixtures, matching the performance of single-task ICL and Transformer baselines. Moreover, Mamba converges faster and exhibits more stable performance than Transformers, allowing it to handle longer context lengths and more complicated prompt structures. We also observe different learning dynamics across ICL tasks.
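To make the prompt structure concrete, the sketch below illustrates how a synthetic mixed-task ICL prompt is typically constructed in this line of work: each sequence interleaves (x, y) pairs generated by one task sampled from a mixture, followed by a query the model must answer. The specific task families, dimensions, and sequence length here are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code) of a mixed-task ICL prompt.
# Assumption: a two-component task mixture over linear and sign-of-linear
# regression, as is common in synthetic ICL setups.
import numpy as np

def sample_task(d, rng):
    """Sample one task from a hypothetical two-component mixture."""
    w = rng.standard_normal(d)
    if rng.random() < 0.5:
        return lambda x: x @ w          # linear regression task
    return lambda x: np.sign(x @ w)     # nonlinear variant of the task

def make_icl_prompt(d=8, n_pairs=16, seed=0):
    """Build one prompt: n_pairs in-context (x_i, y_i) examples plus a query x,
    all generated by the same sampled task."""
    rng = np.random.default_rng(seed)
    f = sample_task(d, rng)
    xs = rng.standard_normal((n_pairs + 1, d))
    ys = f(xs)
    context = [(xs[i], ys[i]) for i in range(n_pairs)]  # in-context pairs
    query_x, target_y = xs[-1], ys[-1]                  # model must predict target_y
    return context, query_x, target_y

context, query_x, target_y = make_icl_prompt()
print(len(context), query_x.shape, target_y)
```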