Position: Don't Just "Fix It in Post": A Science of AI Must Study Learning Dynamics
Abstract
What would it mean to have a scientific understanding of AI? Language models are not static objects—they are snapshots of time-evolving processes shaped by data, objectives, and optimization dynamics. Yet the field predominantly treats models as fixed artifacts, analyzing behaviors after training rather than asking why they emerge. This position paper argues that AI research should move beyond post hoc fixes and study the learning dynamics of models. We envision a hierarchy of scientific maturity: first, predict outcomes from early training signals; then, intervene when trajectories go wrong; and ultimately, design training procedures that guarantee desired properties. Scaling laws have reached the first level for loss; the challenge is extending all three levels to general capabilities, biases, and safety. We articulate requirements for such theories, survey progress across mechanistic interpretability, fairness, memorization, and learning dynamics, and identify concrete open problems. The path forward requires treating models as processes to be understood, not merely artifacts to be patched.