How Chain of Thought Decomposes Complex Tasks
Abstract
Reasoning tasks are characterized by data that lie on a tree. The root represents a prompt, and the ground-truth answer is one of the leaves. Each edge in the tree represents a plausible next reasoning step. We show that Chain of Thought (CoT)-based reasoning is most effective at predicting the answer to a query when this tree has a roughly equal degree at each level. Directly predicting the answer from the prompt is effective only when the tree has a small number of leaves. CoT-based predictors have been observed to perform well on deeper trees, i.e., they reason for an extended number of steps (they "think"). We identify a critical threshold for the degree: below it, such extended reasoning is detrimental; above it, there exists an optimal depth that minimizes error. This minimal error cannot be surpassed by increasing the depth of thinking.
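The tree model underlying the abstract can be sketched in a few lines. This is an illustrative toy, not the paper's formal setup: `degree`, `depth`, and `step_accuracy` are assumed parameters, and the CoT predictor is modeled as making independent per-step choices among a node's children.

```python
def num_leaves(degree: int, depth: int) -> int:
    # A tree with a fixed branching degree at every level has
    # degree**depth leaves (candidate answers for the prompt).
    return degree ** depth

def direct_success(degree: int, depth: int) -> float:
    # Direct answer prediction must pick the ground-truth leaf among
    # all leaves; its chance degrades exponentially in the depth.
    return 1.0 / num_leaves(degree, depth)

def cot_success(step_accuracy: float, depth: int) -> float:
    # A CoT predictor descends the tree one edge at a time; if each
    # step independently picks the correct child with probability
    # step_accuracy, success factors across the depth levels.
    return step_accuracy ** depth

# Example: with degree 3 and depth 4 there are 81 leaves, so direct
# prediction succeeds with probability 1/81, while a CoT predictor
# whose per-step accuracy exceeds 1/3 does strictly better.
print(num_leaves(3, 4))
print(cot_success(0.9, 4) > direct_success(3, 4))
```

The comparison makes the abstract's first claim concrete: direct prediction is viable only when `num_leaves` is small, whereas CoT trades one hard prediction for `depth` easier ones.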