Workshop Sat, Jul 11, 2026 • 8:00 AM – 5:00 PM KST HALL D1

3rd AI for Math Workshop: Toward Self-Evolving Scientific Agents

Haocheng Wang ⋅ Kun Xiang

[ OpenReview]

Abstract

Mathematics has long served as a foundation for scientific discovery and a benchmark for reasoning systems. Recent advances in LLMs and formal methods have enabled AI agents to achieve IMO-level performance in theorem proving and demonstrate strong capabilities in end-to-end natural language mathematical reasoning. Against this backdrop, our workshop explores the next generation of automated research agents capable of reasoning across mathematics and broader scientific domains. We aim to investigate how these agents can achieve self-evolution to advance scientific knowledge.

We invite diverse participants from academia and industry to discuss areas related to the following:

- Formal theorem proving: How can LLM theorem provers transcend Olympiad questions to support real-world mathematics research and teaching, and self-evolve to propose and solve innovative conjectures?
- Precise autoformalization: How can we close the gap between formal and informal mathematical reasoning? How can natural language mathematics be reliably translated into formal languages? How do we verify that the resulting formal statements faithfully preserve the original mathematical intent?
- Automated mathematics in natural language: How can we achieve frontier mathematical reasoning performance with a pure natural language pipeline, including data, generation, and verification?
- Scientific problem solving: How can advances in mathematical reasoning, as a foundation, benefit and transfer to broader scientific fields, e.g., theoretical computer science and physics?
- Multimodal reasoning: How do current reasoning systems use visual information? How can we develop methods to tackle problems in multimodal mathematical and scientific reasoning?

Extending the scope further, we also welcome research related to the following topics:

- Verification and measurement: How can we verify the correctness and measure the faithfulness of AI-generated scientific solutions?
- Human-AI collaboration: What are effective methods for human-AI collaboration in science?
- Scientific agents in related areas: Systems science, causality, finance, bioinformatics, etc.

Our workshop also includes three challenges:

- Track 1: Semantic Alignment Evaluation for Autoformalization
- Track 2: Theoretical Computer Science Proving in Lean
- Track 3: Visual Grounded Physics Problem Solving