

Oral
Thu Jul 17 10:00 AM -- 10:15 AM (PDT) @ West Exhibition Hall C
STAIR: Improving Safety Alignment with Introspective Reasoning
Yichi Zhang · Siyuan Zhang · Yao Huang · Zeyu Xia · Zhengwei Fang · Xiao Yang · Ranjie Duan · Dong Yan · Yinpeng Dong · Jun Zhu
Oral
Thu Jul 17 10:15 AM -- 10:30 AM (PDT) @ West Exhibition Hall C
AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses
Nicholas Carlini · Edoardo Debenedetti · Javier Rando · Milad Nasr · Florian Tramer
Oral
Thu Jul 17 10:30 AM -- 10:45 AM (PDT) @ West Exhibition Hall C
Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards
Yangsibo Huang · Milad Nasr · Anastasios Angelopoulos · Nicholas Carlini · Wei-Lin Chiang · Christopher A. Choquette Choo · Daphne Ippolito · Matthew Jagielski · Katherine Lee · Ken Ziyu Liu · Ion Stoica · Florian Tramer · Chiyuan Zhang
Oral
Thu Jul 17 10:45 AM -- 11:00 AM (PDT) @ West Exhibition Hall C
Model Immunization from a Condition Number Perspective
Amber Yijia Zheng · Cedar Site Bai · Brian Bullins · Raymond A. Yeh