Spotlight Poster

AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders

Zhengxuan Wu · Aryaman Arora · Atticus Geiger · Zheng Wang · Jing Huang · Dan Jurafsky · Christopher Manning · Christopher Potts
2025 Spotlight Poster
