Poster Tue, Jul 7, 2026 • 6:30 PM – 8:15 PM PDT HALL A #804

MARS-SQL: A Multi-Agent Reinforcement Learning Framework For Text-To-SQL

Haolin Yang ⋅ Jipeng Zhang ⋅ Zhitao He ⋅ Alexander Zhou ⋅ Yi Fung

Abstract

Large Language Models (LLMs) often struggle with the precise logic and schema alignment required for complex Text-to-SQL tasks. While current methods rely heavily on static prompting, they lack the ability to dynamically adapt and self-correct through environmental interaction. To bridge this gap, we propose MARS-SQL, a trainable multi-agent framework for Text-to-SQL. Rather than introducing a new standalone SQL primitive, MARS-SQL makes an agentic workflow trainable by decomposing the problem into three specialized roles: schema grounding, query generation, and solution validation. Central to our approach is a generation agent trained via a multi-turn RL policy within a ReAct-style loop. The agent learns to iteratively reason, execute intermediate SQL actions on a live database, and refine its strategy based on execution feedback. To improve robustness, we further introduce a validation mechanism that treats solution selection as a generative modeling task, identifying the optimal interaction trajectory through next-token prediction probabilities. Empirical evaluations demonstrate the effectiveness of coupling interactive learning with trajectory ranking. MARS-SQL achieves state-of-the-art performance, recording an execution accuracy of 77.84% on the BIRD development dataset and 89.75% on the Spider test dataset, while also transferring strongly to out-of-domain benchmarks. Code is available at https://github.com/YangHaolin0526/MARS-SQL.
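To make the interaction loop described in the abstract concrete, here is a minimal sketch of a ReAct-style cycle in which SQL actions are executed against a live database and each result or error is fed back to the policy. It is not the authors' implementation: the names `run_sql`, `interact`, `execution_match`, and `scripted_policy` are hypothetical, a hand-written stub stands in for the trained generation agent, and the schema grounding agent, the RL training, and the validation agent are not shown. The order-insensitive result comparison is an assumed simplification of execution accuracy.

```python
import sqlite3
from typing import Callable, List, Tuple

Step = Tuple[str, str]  # (SQL action, observation text)


def run_sql(conn: sqlite3.Connection, sql: str) -> str:
    """Execute one SQL action and return the observation as text."""
    try:
        rows = conn.execute(sql).fetchall()
        return f"rows: {rows[:5]}"  # truncate so observations stay short
    except sqlite3.Error as exc:
        return f"error: {exc}"


def interact(
    conn: sqlite3.Connection,
    policy: Callable[[List[Step]], Tuple[str, bool]],
    max_turns: int = 5,
) -> List[Step]:
    """Run the loop: the policy proposes SQL, the database answers."""
    history: List[Step] = []
    for _ in range(max_turns):
        sql, is_final = policy(history)
        observation = run_sql(conn, sql)
        history.append((sql, observation))
        # Stop only when the policy commits and the query ran cleanly.
        if is_final and not observation.startswith("error"):
            break
    return history


def execution_match(conn: sqlite3.Connection, predicted: str, gold: str) -> bool:
    """Assumed simplification: compare result rows, ignoring row order."""
    try:
        predicted_rows = conn.execute(predicted).fetchall()
    except sqlite3.Error:
        return False
    gold_rows = conn.execute(gold).fetchall()
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


def scripted_policy(history: List[Step]) -> Tuple[str, bool]:
    """Hypothetical stand-in for the trained agent: one typo, then a fix."""
    if not history:
        return "SELECT nme FROM singer WHERE age > 30", True  # misspelled column
    return "SELECT name FROM singer WHERE age > 30", True


# Toy in-memory database, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ana", 41), ("Bo", 25)])

trajectory = interact(conn, scripted_policy)
final_sql = trajectory[-1][0]
is_correct = execution_match(
    conn, final_sql, "SELECT name FROM singer WHERE age > 30"
)
```

In this toy run the first action fails on the misspelled column, the error text becomes the observation, and the second action succeeds. In a trained system, a signal such as `is_correct` could plausibly serve as the outcome reward for the whole trajectory, though the exact reward design used by MARS-SQL is not specified here.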

Lay Summary

Many people need answers from large databases, but writing database queries usually requires technical expertise. Recent AI systems can translate everyday questions into database queries, but they often make mistakes when the database is large, messy, or has many similar tables and columns. We introduce MARS-SQL, an AI system that solves this problem by working more like a human data analyst. Instead of producing a query in one step, it breaks the task into three parts: first identifying the relevant parts of the database, then trying possible queries while checking the database’s responses, and finally selecting the most reliable answer. During training, the system learns from whether its final query actually produces the correct result, which helps it recover from errors and improve its reasoning. This work matters because it can make professional databases easier to use for people who do not know SQL, while also making AI-generated database answers more reliable. In our experiments, MARS-SQL achieves strong results on several standard benchmarks, showing that interactive, self-correcting AI systems can be a promising direction for practical data analysis.