

Poster in Workshop: 1st ICML Workshop on In-Context Learning (ICL @ ICML 2024)

An In-Context Learning Theoretic Analysis of Chain-of-Thought

Chenxiao Yang · Zhiyuan Li · David Wipf


Abstract:

Large language models (LLMs) have demonstrated remarkable reasoning capabilities under proper prompting strategies, such as augmenting demonstrations with chain-of-thought (CoT). However, an understanding of how different intermediate CoT steps improve reasoning, and of the principles guiding their design, remains elusive. This paper takes an initial step towards addressing these questions by introducing a new analytical framework from a learning theoretic perspective. In particular, we identify a class of in-context learning (ICL) algorithms on few-shot CoT prompts that can learn complex non-linear functions by composing simpler predictors obtained through gradient-descent-based optimization. We show this algorithm can be expressed by Transformers in their forward pass with simple weight constructions. We further analyse the generalization properties of the ICL algorithm for learning different families of target functions. The derived theoretical results suggest several provably effective ways of decomposing target problems and forming CoT prompts, highlighting that the bottleneck lies at the hardest reasoning step. Empirically, we demonstrate that CoT forms derived from our theoretical insights significantly enhance the reasoning capabilities of real-world LLMs in solving challenging arithmetic reasoning tasks.
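To make the composition idea concrete, here is a minimal, hypothetical sketch (not the paper's construction or its Transformer weight construction): a non-linear target is split into two simple stages, each fit separately by plain gradient descent using the intermediate value a CoT demonstration would expose, and the learned stages are then composed to answer a new query. The quadratic target, the two-stage split, and all function and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 5, 3, 200

# Hypothetical two-stage target: an intermediate "thought" z = A x (linear),
# followed by y = b . z**2 (linear in the squared thought). The composed map
# x -> y is a non-linear (quadratic) function of x.
A_true = rng.normal(size=(k, d))
b_true = rng.normal(size=k)

X = rng.normal(size=(n, d))
Z = X @ A_true.T      # intermediate values, as a CoT prompt would demonstrate
Y = (Z ** 2) @ b_true # final answers

def gd_linear(inputs, targets, steps=500):
    """Fit a linear predictor with plain gradient descent on squared loss."""
    T = targets if targets.ndim > 1 else targets[:, None]
    W = np.zeros((T.shape[1], inputs.shape[1]))
    cov = inputs.T @ inputs / len(inputs)
    lr = 1.0 / np.linalg.eigvalsh(cov).max()  # step size in the stable range
    for _ in range(steps):
        grad = (inputs @ W.T - T).T @ inputs / len(inputs)
        W -= lr * grad
    return W

# Stage 1: learn x -> z from the demonstrated intermediate steps.
W1 = gd_linear(X, Z)
# Stage 2: learn z**2 -> y, again a simple linear-in-features problem.
W2 = gd_linear(Z ** 2, Y)

# Compose the two simple predictors to answer a new query.
x_query = rng.normal(size=d)
z_hat = W1 @ x_query
y_hat = float(W2 @ (z_hat ** 2))
y_true = float(b_true @ ((A_true @ x_query) ** 2))
print(f"predicted {y_hat:.3f} vs true {y_true:.3f}")
```

In this toy setting, each stage is easy for gradient descent precisely because the intermediate value is demonstrated, while the end-to-end map is non-linear; the hardest stage to fit would dominate the overall error, loosely mirroring the bottleneck observation in the abstract.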
