Proteo-R1: Thinking Foundation Models for De Novo Protein Binder Design
Abstract
Recent advances in generative diffusion and flow-matching models have revolutionized molecular design, enabling the creation of novel proteins, small molecules, and RNA sequences with unprecedented fidelity. Yet these models remain intuitive rather than intelligent: they generate without reasoning. ThinkProteo reimagines generative science by introducing reasoning-guided diffusion models that think step by step, much as a scientist hypothesizes, tests, and refines molecular ideas. By embedding chain-of-thought (CoT) reasoning into the continuous generative trajectory, ThinkProteo turns diffusion into a process of thought: each denoising step becomes an interpretable act of molecular reasoning guided by structural, energetic, and functional objectives. This framework bridges symbolic reasoning and physical generation, yielding models that not only design molecules but also explain why they work. We envision ThinkProteo as a foundation for cognitive generative chemistry, uniting the creativity of diffusion models with the deliberation of human reasoning to accelerate the discovery of safe and effective therapeutics.