Cascading Adversarial Bias from Injection to Distillation in Language Models

Harsh Chaudhari · Jamie Hayes · Matthew Jagielski · Ilia Shumailov · Milad Nasr · Alina Oprea

Abstract
