Cascading Adversarial Bias from Injection to Distillation in Language Models

Harsh Chaudhari ⋅ Jamie Hayes ⋅ Matthew Jagielski ⋅ Ilia Shumailov ⋅ Milad Nasr ⋅ Alina Oprea

Abstract
