Poster Tue, Jul 15, 2025 • 11:00 AM – 1:30 PM PDT

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability

Adam Karvonen · Can Rager · Johnny Lin · Curt Tigges · Joseph Bloom · David Chanin · Yeu-Tong Lau · Eoin Farrell · Callum McDougall · Kola Ayonrinde · Demian Till · Matthew Wearden · Arthur Conmy · Samuel Marks · Neel Nanda


