

Poster

PoisonBench: Assessing Language Model Vulnerability to Poisoned Preference Data

Tingchen Fu ⋅ Mrinank Sharma ⋅ Phil Torr ⋅ Shay Cohen ⋅ David Krueger ⋅ Fazl Barez
2025 Poster

