Poster in Workshop: Pluralistic Alignment Workshop

Directional Influence and Consensus Formation in Multi-Agent Systems

Prisha Priyadarshini ⋅ Aryan Shrivastava

Abstract

Multi-agent systems are increasingly deployed in real-world applications, and understanding inter-agent dynamics is critical for developing reliable and robust systems. While multi-agent systems have been shown to improve accuracy, the interaction dynamics that drive consensus more generally remain poorly understood. In this paper, we conduct an empirical study of multi-turn agent interactions, analyzing how consensus forms through disagreement and model deference across both objective and subjective datasets. Across experiments, we find that model deference is not a fixed hierarchical property in heterogeneous settings, but instead emerges only under specific conditions. Homogeneous settings, in contrast, do not exhibit a consistent hierarchical structure. Under answer rotation, smaller models still defer to larger models the majority of the time, but the rate at which larger models defer to smaller models increases. This suggests that model identity may not be the sole explanatory factor for model deference. Additionally, multi-agent dynamics can be actively controlled via system prompts. Overall, disagreement and model deference provide informative signals, beyond accuracy, for studying multi-agent behavior and assessing the reliability and robustness of multi-agent systems.