LLMInertia: Adaptive Counter-Inertial Reasoning to Improve Evidence Faithfulness in Large Language Models
Abstract
Large Language Models (LLMs) frequently generate output that contradicts explicit input evidence, limiting their reliability in real-world applications. We identify cognitive inertia in LLMs, the tendency to rely excessively on co-occurrence associations learned during pretraining and to resist adaptation when conflicting input evidence appears, as a critical factor behind such hallucinations. We further show empirically that adherence to input evidence declines as co-occurrence associations strengthen, whether through higher data frequency or more intensive training. Inspired by human counter-inertial thinking, we propose LLMInertia, an adaptive counter-inertial reasoning framework that probes input-related cognitive inertia in the LLM and generates adaptive counter-inertial reminders, which are then injected into the prompt to promote evidence-based reasoning. Experiments on co-occurrence induction datasets show that LLMInertia reduces hallucination rates by up to 35\% and improves accuracy by up to 35.68\%. Extensive evaluations on four context-rich summarization and QA datasets, across three LLM backbones of varying scales, further validate its effectiveness and robustness. Our work offers new insight into the causes of input-unfaithful hallucinations in LLMs, contributing to the development of more reliable AI.
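As a rough illustration of the probe-and-remind idea summarized above (a minimal sketch, not the paper's actual implementation), the snippet below shows one way a counter-inertial reminder might be constructed and injected into a prompt. The llm_generate stub, the probing prompt, and the reminder wording are assumptions for illustration only.

```python
from typing import Callable


def llm_generate(prompt: str) -> str:
    """Placeholder for a real LLM call (assumption; substitute an actual API)."""
    raise NotImplementedError


def probe_inertia(question: str, generate: Callable[[str], str]) -> str:
    # Ask the question WITHOUT the input evidence to surface the model's
    # prior, co-occurrence-driven answer (its cognitive inertia).
    return generate(f"Answer from your own knowledge only: {question}")


def build_counter_inertial_prompt(question: str, evidence: str, prior_answer: str) -> str:
    # Inject an adaptive reminder that names the prior inclination and
    # instructs the model to ground its answer in the provided evidence.
    reminder = (
        f"Note: you may be inclined to answer '{prior_answer}' based on prior "
        "associations. Set that inclination aside if it conflicts with the "
        "evidence below, and answer strictly from the evidence."
    )
    return f"{reminder}\n\nEvidence: {evidence}\n\nQuestion: {question}"


# Usage with a real generate function:
#   prior = probe_inertia(question, llm_generate)
#   answer = llm_generate(build_counter_inertial_prompt(question, evidence, prior))
```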