Reasoning Can Be Restored by Correcting a Few Decision Tokens
Shen Changshuo ⋅ Leheng Sheng ⋅ Yuxin Chen ⋅ Xiang Wang ⋅ An Zhang
Abstract
Large reasoning models (LRMs) substantially outperform their base LLM counterparts on challenging reasoning benchmarks, yet it remains poorly understood where base models go wrong during token-by-token generation and how to narrow this gap efficiently. We study the base–reasoning gap by quantifying token-level distributional disagreement between a base model and a stronger reasoning model using likelihood-based divergences. Across benchmarks, we find that the reasoning advantage is highly sparse and concentrates on a small set of early, planning-related decision tokens. For instance, on Qwen3-0.6B, only $\sim$8\% of generated tokens account for the salient disagreement; these tokens concentrate early in the response, are strongly enriched in planning-related decisions ($17\times$), and coincide with high base-model uncertainty—suggesting that base models fail mainly at early planning points that steer the subsequent reasoning trajectory. Building on these findings, we propose disagreement-guided token intervention, a simple inference-time delegation scheme that performs a one-token takeover by the reasoning model only at high-disagreement positions and immediately switches back to the base model. With a small intervention budget, this sparse delegation substantially recovers and can even surpass the performance of a same-size reasoning model on challenging reasoning tasks. Code is available at \url{https://anonymous.4open.science/r/RRTokenIntervention-EBDD}.
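The delegation scheme described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes KL divergence as the disagreement measure, greedy decoding, and toy callable "models" that return next-token probability lists; all function names here are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) over aligned probability lists (assumes q has no zeros)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def generate_with_intervention(base_model, reasoning_model, prompt,
                               max_tokens, threshold):
    """Disagreement-guided token intervention (illustrative sketch).

    At each decoding step, compare the two models' next-token
    distributions. If their divergence exceeds `threshold`, take a
    single token from the reasoning model (one-token takeover), then
    immediately hand control back to the base model.
    """
    tokens = list(prompt)
    interventions = 0
    for _ in range(max_tokens):
        p_base = base_model(tokens)
        p_reason = reasoning_model(tokens)
        if kl_divergence(p_reason, p_base) > threshold:
            dist = p_reason          # high disagreement: delegate one token
            interventions += 1
        else:
            dist = p_base            # low disagreement: base model decodes
        tokens.append(max(range(len(dist)), key=dist.__getitem__))  # greedy
    return tokens, interventions
```

Under this sketch, the intervention budget is simply the number of positions whose divergence crosses the threshold, so sparser disagreement directly translates into fewer reasoning-model calls.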