Poster in Workshop: AI as a Tool for Mathematics, Computer Science, and Machine Learning
Thu, Jul 9, 2026 • 7:50 PM – 9:00 PM PDT

Socrates: Structured Questioning Unlocks Latent Knowledge in AI Research Agents

Damir Vrabac ⋅ Prannay Hebbar ⋅ Yogendra Manawat ⋅ Selvam Palanimalai ⋅ Samuel Verboomen ⋅ Gurusha Juneja ⋅ Kunal Bhatia ⋅ Vignesh Baskaran

Abstract

LLM agents on open-ended research tasks consistently fail to apply knowledge they demonstrably possess: frontier models score above 88% on MMLU machine-learning content yet earn Kaggle medals on only 16.9% of MLE-bench tasks. We argue that the bottleneck is knowledge activation, not capacity. We introduce Socrates, a multi-agent protocol that pairs a tool-using Scientist with a question-only advisor that cannot provide answers, issue directives, or use tools. Its probing questions force the Scientist to surface its own latent knowledge into context. On five MLE-bench tasks, Socrates improves Kaggle test scores on 4 of 5 tasks (mean +55.9%) and outperforms a generic-PI baseline on 4 of 5 tasks, confirming that the gain comes from the nature of the questioning rather than from extra interaction.
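As a rough illustration of the question-only constraint described in the abstract, the sketch below shows one way such a dialogue loop might be wired up. It is not the authors' implementation. The names `is_question_only` and `run_dialogue`, the turn order, the number of rounds, and the punctuation-based filter are all assumptions made for illustration. The Scientist and advisor are plain callables standing in for LLM-backed agents.

```python
import re
from typing import Callable, List, Tuple

Transcript = List[Tuple[str, str]]  # (speaker, message) pairs


def is_question_only(message: str) -> bool:
    """Hypothetical guard: accept a message only if every sentence ends in '?'.

    This is a crude punctuation check (an assumption, not the paper's method).
    It will also reject questions that contain a decimal point or abbreviation.
    """
    text = message.strip()
    if not text:
        return False
    # Split into chunks that each end at a run of sentence terminators.
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]*", text)]
    sentences = [s for s in sentences if s]
    return bool(sentences) and all(s.endswith("?") for s in sentences)


def run_dialogue(
    scientist: Callable[[Transcript], str],
    advisor: Callable[[Transcript], str],
    task: str,
    rounds: int = 3,
) -> Transcript:
    """Alternate advisor questions and Scientist replies for a fixed number of rounds.

    Advisor messages that are not pure questions are discarded, so answers and
    directives never reach the Scientist's context (assumed enforcement policy).
    """
    transcript: Transcript = [("task", task)]
    for _ in range(rounds):
        question = advisor(transcript)
        if not is_question_only(question):
            continue  # drop anything that is not purely a question
        transcript.append(("advisor", question))
        # The Scientist replies with the question in context; in a real system
        # this is where tool use would happen (stubbed out here).
        transcript.append(("scientist", scientist(transcript)))
    return transcript
```

In this sketch the advisor has no tool access by construction, since it only receives the transcript and returns text. A real system would likely enforce the question-only rule through prompting or a stronger classifier rather than a punctuation check.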