

Poster
in
Workshop: Knowledge and Logical Reasoning in the Era of Data-driven Learning

Training LLMs with Noisy Algorithmic Chain of Thought

Alex Havrilla


Abstract:

Much recent effort has gone into distilling the chain-of-thought (CoT) capabilities of large language models by training smaller models directly on sampled traces. However, less attention has been paid to the quality, or "noisiness", of the distilled CoT and how it impacts supervised performance. We begin studying this problem in the highly controlled setting of algorithmically solvable tasks on lists of integers. To do so, we develop the TInt framework, which generates highly customizable noisy algorithmic chains of thought for evaluating arbitrary functions on lists of integers. Using this framework, we first benchmark performance baselines for arithmetic and list-median-finding tasks with and without CoT, while studying best practices for designing good algorithmic CoT. We then introduce three types of noise to the tasks and study their effect on performance. We find that training with algorithmic CoT is remarkably robust to "static noise", which preserves CoT form while mutating content, even when the entire dataset is contaminated. However, "dynamic noise", which alters both the form and content of CoT, is more destructive even at lower dataset noise levels.
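The static/dynamic noise distinction can be sketched in code. The example below is a minimal illustration, not the paper's TInt framework: the trace format (`addition_cot`), the digit-flip static noise, and the step-dropping dynamic noise are all hypothetical choices made for demonstration. Static noise keeps every line's template intact while corrupting a digit; dynamic noise deletes a step, changing the trace's structure.

```python
import random

def addition_cot(a, b):
    """Generate a digit-by-digit addition trace (hypothetical format)."""
    steps = []
    da, db = str(a)[::-1], str(b)[::-1]  # least-significant digit first
    carry, i, result = 0, 0, []
    while i < max(len(da), len(db)) or carry:
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        s = x + y + carry
        steps.append(f"digit {i}: {x} + {y} + carry {carry} = {s}")
        result.append(str(s % 10))
        carry = s // 10
        i += 1
    steps.append(f"answer: {''.join(reversed(result))}")
    return steps

def static_noise(steps, rng):
    """Mutate content, preserve form: replace one digit in one step."""
    steps = steps[:]
    idx = rng.randrange(len(steps))
    chars = list(steps[idx])
    digit_positions = [j for j, c in enumerate(chars) if c.isdigit()]
    j = rng.choice(digit_positions)
    # Shift the digit by 1-9 mod 10, so it always changes.
    chars[j] = str((int(chars[j]) + rng.randrange(1, 10)) % 10)
    steps[idx] = "".join(chars)
    return steps

def dynamic_noise(steps, rng):
    """Alter both form and content: drop a random intermediate step."""
    steps = steps[:]
    del steps[rng.randrange(len(steps) - 1)]  # keep the final answer line
    return steps
```

For example, `addition_cot(17, 25)` ends with the line `answer: 42`; a statically noised copy has the same number of lines with one corrupted digit, while a dynamically noised copy is missing a step entirely.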
