(4 events)
The 2026 schedule is still incomplete
Oral
Wed Jul 08 10:00 AM -- 10:15 AM (KST)
Jailbreak Foundry: From Papers to Runnable Attacks for Reproducible Benchmarking
Zhicheng Fang ⋅ Jingjie Zheng ⋅ Chenxu Fu ⋅ Wei Xu
Oral
Wed Jul 08 10:15 AM -- 10:30 AM (KST)
Quantifying Frontier LLM Capabilities for Container Sandbox Escape
Rahul Marchand ⋅ Art Cathain ⋅ Jerome Wynne ⋅ Philippos Giavridis ⋅ Sam Deverett ⋅ John Wilkinson ⋅ Jason Gwartz ⋅ Harry Coppock
Oral
Wed Jul 08 10:30 AM -- 10:45 AM (KST)
Robust Harmful Features Under Jailbreak Attacks: Mechanistic Evidence from Attention Head Specialization in Large Language Models
Yanchen Yin ⋅ Dongqi Han ⋅ Linghui Li
Oral
Wed Jul 08 10:45 AM -- 11:00 AM (KST)
When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models
Jiacheng Hou ⋅ Yining Sun ⋅ Ruochong Jin ⋅ Haochen Han ⋅ Fangming Liu ⋅ Victor Chan ⋅ Alex Jinpeng Wang