Poster in Workshop: Second Workshop on Technical AI Governance Research

The Illusion of Choice: Assessing Impact of Manipulated LLMs

Yee Man Choi ⋅ Abdul B Ali ⋅ Krzysztof Czarnecki ⋅ Freda Shi

Abstract

Large language models (LLMs) are increasingly used for decision support, either directly by users or through delegated AI agents. This introduces a new risk: corrupted pipelines and reframed evidence can bias decisions. As an initial study, we examine retrieval-augmented generation (RAG) under combined manipulation: poisoned sources together with system-prompt instructions that strategically reframe the retrieved evidence. In a controlled, synthetic decision environment with known ground truth, we compare the decision accuracy of human participants and delegated AI agents across neutral and manipulated RAG systems. In this pilot study with 20 human participants, we find that manipulation reduces decision accuracy by up to 50 percentage points for humans and 28 percentage points for agents. Even after disclosure, only 60% of human participants in the manipulated condition recognize the manipulation. Meanwhile, agents are more likely than humans to correct their initial decisions when new evidence becomes available (93% vs. 40%). On the explicit post-disclosure condition-guess measure, however, LLM participants also show a higher false-positive rate than humans (48% vs. 30%). These findings show that manipulated RAG-based decision-support systems can bias both direct and delegated decisions, while revealing distinct failure modes in how humans and agents detect and correct manipulation.