Poster

Codebook Features: Sparse and Discrete Interpretability for Neural Networks

Alex Tamkin · Mohammad Taufeeque · Noah Goodman
2024 Poster

Abstract
