Reasoning Compartmentalization: Bridging the Concretization Gap via Abstraction-based Routing
Abstract
While previous research has documented the sensitivity of Large Language Models (LLMs) to surface-level changes in problem formulation, the impact of such changes on internal representations and learning dynamics remains under-explored. In this work, we study this question using a controlled setup with paired reasoning tasks that are logically identical but expressed either in an abstract formal language (FL) or in natural language (NL). We find that converting FL problems into NL consistently degrades reasoning accuracy. More importantly, we show that FL and NL inputs activate largely separate internal representations and exhibit weak learning transfer between them. We refer to this phenomenon as reasoning compartmentalization. To test whether this compartmentalization can be mitigated, we introduce abstraction-based alignment, in which models are trained to translate NL inputs into their corresponding FL forms. While this training significantly improves reasoning performance, FL and NL representations remain largely distinct, and learning transfer across formulations remains limited. Through activation-level interventions, we further show that these performance improvements arise not from representational fusion, but from improved routing. This suggests that abstraction alleviates formulation sensitivity by strengthening connections between formulation-specific reasoning pathways, rather than by aligning their representations.