Poster in Workshop: Next Generation of AI Safety

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

Liwei Jiang ⋅ Kavel Rao ⋅ Seungju Han ⋅ Allyson Ettinger ⋅ Faeze Brahman ⋅ Sachin Kumar ⋅ Niloofar Mireshghallah ⋅ Ximing Lu ⋅ Maarten Sap ⋅ Nouha Dziri ⋅ Yejin Choi

Abstract
