Invited Talk • Wed, Jul 8, 2026 • 4:30 PM – 5:30 PM PDT • Hall C

From Behavioural Guardrails to Principled Agency

Verena Rieser

Abstract

How can agents make safe autonomous decisions in complex, dynamic environments? While significant progress has been made in establishing post-training guardrails to enforce conversational compliance in generative models, these rigid constraints often prove brittle in open-world environments. I argue that achieving generalizable agentic safety requires Normative Alignment: a new paradigm that moves beyond passive harm avoidance to equip autonomous systems with Agentic Integrity. This approach provides agents with the structural capability to interpret, reason through, and dynamically apply abstract principles when literal instructions fail. Realizing this paradigm presents a triple challenge of capability, measurement, and governance. First, it requires shifting model capability beyond generic reward maximization toward normative competence, that is, toward the contextual reasoning needed to adjudicate complex trade-offs in non-verifiable domains. Second, it demands new metrics that move optimization targets beyond immediate preference satisfaction toward long-term human well-being. Third, it requires deliberative governance that grounds alignment targets in pluralistic, representative societal input, so that these systems avoid top-down paternalism.

Speaker

Verena Rieser

Verena Rieser is a Research Lead at Google DeepMind, where she leads efforts in responsible alignment for Gemini. She was previously a full professor of Artificial Intelligence at Heriot-Watt University and co-founder of a conversational AI startup. She earned her PhD from Saarland University in 2008, pioneering the use of reinforcement learning in dialogue systems. Her contributions to generative and conversational AI have been recognized with numerous international awards, including a Royal Society Leverhulme Senior Research Fellowship. Building on her ACL 2025 keynote on reimagining alignment for truly beneficial AI, her ICML 2026 address expands this vision into a human-centred research roadmap, navigating the shift from instruction-following assistants to principled, autonomous agents.