Task-and-Model-Aware Fractal-Consistency for Efficient LLM Reasoning
Abstract
While self-consistency methods have emerged as a promising approach to improving the correctness of large language model (LLM) outputs by aggregating multiple stochastic samples, they suffer from two critical limitations that drive up computational cost. First, they evaluate output consistency monolithically, failing to combine partially correct answers across samples. Second, they rely on static stopping criteria that cannot adapt to varying task complexity and model capability, yielding suboptimal computational efficiency. In this work, we present Task-and-Model-Aware Fractal-Consistency (TMAFC), a novel self-consistency framework that addresses these limitations through two key innovations: (1) Fractal-Consistency, which evaluates consistency at the granularity of output components, effectively combining partially correct answers across samples, and (2) Adaptive Stopping Criteria Calibration (ASCC), which dynamically adjusts the sampling stopping criterion based on real-time assessment of both task difficulty and LLM capability. Extensive experiments on diverse question-answering benchmarks demonstrate that TMAFC achieves superior efficiency-accuracy trade-offs, reducing sample cost by up to 55\% while maintaining accuracy competitive with state-of-the-art baselines.
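The contrast between monolithic voting and component-level aggregation can be sketched as follows. This is an illustrative toy, not the paper's actual Fractal-Consistency algorithm: answers are assumed to decompose into fixed-length tuples of components, and each component is aggregated by independent majority vote.

```python
from collections import Counter

def monolithic_vote(samples):
    """Standard self-consistency: majority vote over whole answers.

    A sample only contributes if it matches other samples exactly,
    so partially correct samples are wasted.
    """
    return Counter(tuple(s) for s in samples).most_common(1)[0][0]

def componentwise_vote(samples):
    """Component-level aggregation (in the spirit of Fractal-Consistency,
    illustrative only): majority-vote each answer component independently,
    so partially correct samples still contribute where they agree.
    """
    n_parts = len(samples[0])
    return tuple(
        Counter(s[i] for s in samples).most_common(1)[0][0]
        for i in range(n_parts)
    )

# Three samples answering a three-part question. No single sample is
# fully correct, yet every part has a correct two-of-three majority.
samples = [
    ("a", "b", "x"),
    ("a", "y", "c"),
    ("z", "b", "c"),
]

# Monolithic voting sees three distinct answers (a three-way tie), so it
# cannot recover the correct tuple; component-wise voting can.
print(componentwise_vote(samples))  # ("a", "b", "c")
```

Under this toy decomposition, the component-wise vote recovers `("a", "b", "c")` even though no individual sample produced it, which is the kind of partial-credit recombination the abstract attributes to Fractal-Consistency.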