Poster

Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems

David T. Hoffmann ⋅ Simon Schrodi ⋅ Jelena Bratulić ⋅ Nadine Behrmann ⋅ Volker Fischer ⋅ Thomas Brox
2024 Poster
