MindZero: Learning Online Mental Reasoning With Zero Annotations
Abstract
Effective real-world assistance requires AI agents with robust Theory of Mind (ToM): the ability to infer human mental states from observed behavior. Despite recent advances, several key challenges remain, including (1) online inference with robust uncertainty updates over multiple hypotheses; (2) efficient reasoning suitable for real-time assistance; and (3) the lack of ground-truth mental state annotations in real-world domains. We address these challenges by introducing MindZero, a self-supervised reinforcement learning framework that trains language models to perform efficient and robust online mental reasoning. During training, the model is rewarded for generating mental state hypotheses that maximize the likelihood of observed actions as estimated by a planner, mirroring model-based ToM reasoning. This eliminates the need for explicit mental state annotations. Through training, MindZero internalizes model-based reasoning and performs mental inference in a single forward pass at test time. We evaluate MindZero in four challenging mental reasoning and AI assistance domains. MindZero matches the robustness of explicit model-based methods while significantly accelerating inference, outperforming state-of-the-art methods by a large margin. These results demonstrate that mental reasoning can be learned as a self-supervised skill, bridging the gap between robustness and efficiency in ToM modeling.
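The self-supervised reward described above can be illustrated with a minimal sketch. All names, the toy grid world, and the Boltzmann-rational planner below are illustrative assumptions, not the paper's implementation: a hypothesized mental state (here, a goal position) is scored by how well a simple planner's action distribution explains the observed action, so no mental state labels are required.

```python
import math

# Toy grid world; the planner assumes the agent moves toward its goal.
# (Hypothetical example -- not the paper's actual planner or domains.)
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def planner_action_probs(pos, goal, beta=2.0):
    """Boltzmann-rational policy: actions that reduce Manhattan distance
    to the hypothesized goal get exponentially higher probability."""
    scores = {}
    for action, (dx, dy) in ACTIONS.items():
        nx, ny = pos[0] + dx, pos[1] + dy
        dist = abs(goal[0] - nx) + abs(goal[1] - ny)
        scores[action] = math.exp(-beta * dist)
    total = sum(scores.values())
    return {action: s / total for action, s in scores.items()}

def self_supervised_reward(hypothesized_goal, pos, observed_action):
    """Reward for a mental state hypothesis: log-likelihood of the
    observed action under the planner, given that hypothesis."""
    probs = planner_action_probs(pos, hypothesized_goal)
    return math.log(probs[observed_action])

# An agent at (0, 0) moves right; the hypothesis "goal at (3, 0)" explains
# this action better than "goal at (-3, 0)", so it earns a higher reward.
good = self_supervised_reward((3, 0), (0, 0), "right")
bad = self_supervised_reward((-3, 0), (0, 0), "right")
assert good > bad
```

In this sketch the planner supplies the learning signal that annotated mental states would otherwise provide: hypotheses that better explain observed behavior receive higher reward, which is the core idea the abstract attributes to MindZero.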