Transformers with RL or SFT Provably Learn Sparse Boolean Functions, But Differently
Bochen Lyu ⋅ Yiyang Jia ⋅ Xiaohao Cai ⋅ Zhanxing Zhu
Abstract
Transformers can acquire Chain-of-Thought (CoT) capabilities to solve complex reasoning tasks through fine-tuning. Reinforcement learning (RL) and supervised fine-tuning (SFT) are two primary approaches to this end. In this work, we examine them specifically for learning $k$-sparse Boolean functions with a one-layer transformer and intermediate supervision akin to CoT. In particular, we consider $k$-sparse Boolean functions that can be recursively decomposed into a fixed 2-sparse Boolean function. We first analyze the learning dynamics of fine-tuning the transformer via either RL or SFT with CoT in a unified way. This allows us to identify sufficient conditions under which the transformer provably learns general sparse Boolean functions. We then verify that these conditions hold for three basic examples: $k$-PARITY, $k$-AND, and $k$-OR, demonstrating their learnability via both RL and SFT. Notably, we reveal that RL and SFT exhibit distinct learning behaviors: RL learns the whole CoT chain simultaneously, whereas SFT naturally learns the CoT chain step-by-step. Overall, our findings provide theoretical insights into the underlying mechanisms of RL and SFT and how they differ in triggering the CoT capabilities of transformers.
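To make the problem setup concrete, the following is a minimal sketch (not the paper's construction) of how a $k$-sparse Boolean function such as $k$-PARITY decomposes into repeated applications of a fixed 2-sparse function (here XOR), with the intermediate values forming the CoT-style supervision chain; the function name `parity_chain` and the support set are illustrative assumptions.

```python
import random

def parity_chain(x, support):
    """Compute k-PARITY over the coordinates in `support` by recursively
    applying the fixed 2-sparse function XOR, recording every intermediate
    value; the recorded chain plays the role of CoT supervision."""
    chain = [x[support[0]]]
    for i in support[1:]:
        chain.append(chain[-1] ^ x[i])  # XOR is the fixed 2-sparse step
    return chain  # chain[-1] is the final k-PARITY label

# Example: a 3-sparse parity on coordinates {0, 2, 5} of a length-8 input.
x = [random.randint(0, 1) for _ in range(8)]
chain = parity_chain(x, support=[0, 2, 5])
print(x, chain)  # intermediate CoT steps followed by the final label
```

Replacing XOR with AND or OR in the recursive step yields the analogous chains for $k$-AND and $k$-OR.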