6 Results
Poster | Wed 4:30 | TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors | Yichuan Mo · Hui Huang · Mingjie Li · Ang Li · Yisen Wang
Poster | Wed 4:30 | On Prompt-Driven Safeguarding for Large Language Models | Chujie Zheng · Fan Yin · Hao Zhou · Fandong Meng · Jie Zhou · Kai-Wei Chang · Minlie Huang · Nanyun Peng
Poster | Tue 2:30 | Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation | Danny Halawi · Alexander Wei · Eric Wallace · Tony Wang · Nika Haghtalab · Jacob Steinhardt
Workshop | | Can Language Models Safeguard Themselves, Instantly and For Free? | Dyah Adila · Changho Shin · Yijing Zhang · Frederic Sala
Workshop | | BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM Safeguards | Diego Dorn · Alexandre Variengien · Charbel-Raphaël Segerie · Vincent Corruble
Workshop | | Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs | Jinmin Li · Kuofeng Gao · Yang Bai · Jingyun Zhang · Shutao Xia