The Illusion of Choice: Assessing Impact of Manipulated LLMs
Abstract
Large language models (LLMs) are increasingly used for decision support, either directly by users or through delegated AI agents. This introduces a new risk: corrupted pipelines and reframed evidence can bias decisions. As an initial study, we examine retrieval-augmented generation (RAG) under combined manipulation: poisoned sources together with system-prompt instructions that strategically reframe the retrieved evidence. In a controlled, synthetic decision environment with known ground truth, we compare the decision accuracy of human participants and delegated AI agents under neutral and manipulated RAG systems. We find that manipulation reduces decision accuracy by up to 50 percentage points for humans and 28 percentage points for agents. Even after disclosure, only 60% of human participants in the manipulated condition recognize the manipulation. Meanwhile, agents are more likely than humans to correct their initial decisions after receiving new evidence (93% vs. 40%), but are less reliable at detecting manipulation, exhibiting a higher false-positive rate (48% vs. 30%). These findings show that manipulated RAG-based decision-support systems can bias both direct and delegated decisions, and they reveal distinct failure modes in how humans and agents detect and correct manipulation.