Plenary Speaker
High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning
Misleading Endpoints – Lessons from LLM Training Dynamics
Angelica Chen
Abstract: Many machine learning analyses focus on metrics measured at the end of training, but interpreting only these metrics can be misleading. In this talk, we present two examples of how analyzing training dynamics can yield deeper insights into LLM behavior than interpreting the endpoints alone. In the first, we demonstrate how a common interpretability artifact may appear uncorrelated with model performance at the end of training, yet exhibits a causal relationship with key learning strategies at the beginning of training. In the second, we study a case in which the theoretical properties of the optimal policy differ dramatically from those of the fully trained model, and we show how the model’s learning dynamics on different partitions of the training dataset offer an explanation that reconciles this difference. In both cases, interpreting only the endpoint of training (whether theoretical or empirical) may misrepresent what the model actually learns during training.
Bio: Angelica Chen is a PhD student at NYU, advised by Kyunghyun Cho. She is broadly interested in understanding LLM training and in using these insights to improve how LLMs learn from feedback. She has previously interned at Google DeepMind and Google Research, and she completed her undergraduate degree at Princeton, where her work earned an Outstanding Computer Science Thesis award.