Poster

Decoy for the Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting

Huanli Gong ⋅ Zhipeng Wei ⋅ Yu Fu ⋅ Haz Shahgir ⋅ Ananya Gupta ⋅ Yue Dong ⋅ N. Benjamin Erichson

Abstract
