

Poster in Workshop: High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning

Fine-grained Analysis of In-context Linear Estimation

Yingcong Li · Ankit Singh Rawat · Samet Oymak


Abstract:

In this work, we develop a characterization of the optimization and generalization landscape of in-context learning (ICL) through contributions on architectures, low-rank parameterization, and correlated designs: (1) We study the landscape of 1-layer linear attention and 1-layer H3, a state-space model. Under a suitable correlated design assumption, we prove that both implement 1-step preconditioned gradient descent. (2) By studying correlated designs, we provide new risk bounds for retrieval-augmented generation (RAG) which reveal how ICL sample complexity benefits significantly from distributional alignment. (3) We derive the optimal risk for low-rank parameterized attention weights in terms of the covariance spectrum. Through this, we also shed light on how LoRA can adapt to a new distribution by capturing the shift between task covariances.
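To make contribution (1) concrete, here is a minimal sketch of the 1-step preconditioned gradient descent correspondence for 1-layer linear attention; the notation (a learned preconditioner \Gamma, in-context examples (x_i, y_i), and query x_{n+1}) is illustrative and not taken from the paper's exact statement.

% Illustrative sketch: given a prompt (x_1, y_1, ..., x_n, y_n, x_{n+1}),
% a 1-layer linear-attention prediction can be written as
\hat{y}_{n+1} \;=\; x_{n+1}^\top \Gamma \Big( \tfrac{1}{n} \textstyle\sum_{i=1}^{n} y_i x_i \Big),
% which coincides with one gradient step from \beta_0 = 0 on the in-context least-squares loss
\mathcal{L}(\beta) \;=\; \tfrac{1}{2n} \textstyle\sum_{i=1}^{n} \big( y_i - x_i^\top \beta \big)^2,
% preconditioned by the learned matrix \Gamma:
\beta_1 \;=\; \beta_0 - \Gamma \, \nabla \mathcal{L}(\beta_0) \;=\; \Gamma \Big( \tfrac{1}{n} \textstyle\sum_{i=1}^{n} y_i x_i \Big),
\qquad \hat{y}_{n+1} \;=\; x_{n+1}^\top \beta_1 .

Under this reading, the optimization landscape of the attention weights reduces to the choice of \Gamma, which is where the covariance spectrum and correlated-design assumptions in contributions (2) and (3) enter.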
