Poster in Workshop: Next Generation of AI Safety
AI Agents with Formal Security Guarantees
Mislav Balunovic · Luca Beurer-Kellner · Marc Fischer · Martin Vechev
Keywords: [ Privacy ] [ Prompt Injections ] [ Security ] [ Agents ]
We propose a novel system that enables secure and controllable AI agents by augmenting them with a formal security analyzer. In contrast to prior work, our system does not attempt to detect prompt injections on a best-effort basis; instead, it imposes hard constraints on the agent's actions, thereby preventing the effects of an injection. The constraints can be specified in a novel, flexible domain-specific language for security rules. Before the agent takes an action, the analyzer checks the current agent state for violations of the provided policy rules and raises an error if the proposed action is not allowed. When the analyzer determines an action to be safe, it does so with a formal guarantee that none of the rules specified in the policy are violated. We show that our analyzer is effective, detecting and preventing security vulnerabilities in real-world agents.
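The check-before-act loop described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual analyzer or its domain-specific language: all names (`Action`, `AgentState`, `check`, the example rule) are hypothetical, and the example rule (forbid sending email after reading untrusted content, a common prompt-injection exfiltration pattern) is an assumed policy for illustration.

```python
# Hypothetical sketch of a check-before-act policy analyzer.
# All names and the example rule are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class Action:
    tool: str                                   # tool the agent proposes to invoke
    args: dict = field(default_factory=dict)


@dataclass
class AgentState:
    history: list = field(default_factory=list)  # actions executed so far


class PolicyViolation(Exception):
    """Raised when a proposed action violates a policy rule."""


def no_send_after_untrusted_read(state: AgentState, action: Action):
    """Example rule: forbid sending email after reading untrusted content."""
    read_untrusted = any(
        a.tool == "read_file" and a.args.get("untrusted") for a in state.history
    )
    if read_untrusted and action.tool == "send_email":
        return "send_email after reading untrusted content"
    return None


def check(state: AgentState, action: Action, rules):
    """Check the proposed action against every rule before executing it.

    Raises PolicyViolation if any rule flags the action; otherwise the
    action passes all rules and is recorded as executed.
    """
    for rule in rules:
        violation = rule(state, action)
        if violation is not None:
            raise PolicyViolation(violation)
    state.history.append(action)


state = AgentState()
rules = [no_send_after_untrusted_read]

# Reading an untrusted file is allowed by this policy...
check(state, Action("read_file", {"path": "inbox.txt", "untrusted": True}), rules)

# ...but a subsequent send_email is blocked, regardless of whether an
# injection was detected in the file's content.
try:
    check(state, Action("send_email", {"to": "attacker@example.com"}), rules)
    blocked = None
except PolicyViolation as e:
    blocked = str(e)
```

Because every rule is evaluated against the full agent state before each action, an action that passes `check` is guaranteed not to violate any rule in the policy, independently of whether an injection attempt was recognized as such.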