Poster

Introspection Adapters: Training LLMs to Report Their Learned Behaviors

Keshav Shenoy ⋅ Li Yang ⋅ Abhay Sheshadri ⋅ Jack Lindsey ⋅ Samuel Marks ⋅ Rowan Wang

Abstract
