Poster in Workshop: Next Generation of AI Safety

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

Liwei Jiang · Kavel Rao · Seungju Han · Allyson Ettinger · Faeze Brahman · Sachin Kumar · Niloofar Mireshghallah · Ximing Lu · Maarten Sap · Nouha Dziri · Yejin Choi

Abstract
