Poster Mon, Jul 6, 2026 • 6:30 PM – 8:15 PM PDT HALL A #115

Why Specialist Models Still Matter: A Heterogeneous Multi-Agent Paradigm for Medical Artificial Intelligence

Yanan Wang ⋅ Shuaicong Hu ⋅ Jian Liu ⋅ Guohui Zhou ⋅ Aiguo Wang ⋅ Cuiwei Yang

Abstract

The impressive performance of generalist large language models (LLMs) such as GPT and Claude in healthcare raises a critical question: will domain-specific medical specialist models become obsolete? We argue that the future of medical artificial intelligence (AI) lies not in building monolithic medical foundation models, nor in replacing human expertise, but in orchestrating collaboration among generalist LLMs, domain-specific specialist models, and clinicians. We propose HetMedAgent, a heterogeneous medical multi-agent framework that enables conflict-aware evidence fusion, uncertainty-based clinician intervention triggering, and adaptive threshold calibration. Experiments on three real-world clinical decision-making tasks demonstrate that the synergy between generalist LLMs and domain-specific specialist models significantly outperforms using either type of model alone, validating the irreplaceable value of specialist models in modality-specific analysis. HetMedAgent represents a shift from building medical LLMs or foundation models to multi-agent collaboration, achieving a balance between general reasoning capabilities and domain-specific precision.

Lay Summary

Large AI language models can answer many medical questions, but healthcare decisions often depend on detailed evidence from different sources, such as heart ultrasound reports, ECG images, and clinicians’ experience. Relying on one general-purpose AI system may miss important details or give unsafe recommendations. We developed HetMedAgent, a collaborative medical AI framework in which different “agents” work together. A general AI model coordinates the task and summarizes the reasoning, while specialist AI models analyze specific medical data such as echocardiography reports and ECG images. When the system is uncertain or the specialists disagree, the case is sent to a clinician for review instead of being handled automatically. We tested this approach on real cardiovascular decision-making tasks, including predicting hospital admission risk, likely admission cause, and disease severity. Combining general AI with specialist models performed better than using either alone, showing that specialist medical models still provide important value. We also tested the system on chest X-ray cases and found encouraging results in another clinical domain. Our work suggests that the future of medical AI should not be a single model replacing doctors, but a team-based system where general AI, specialist AI, and clinicians each contribute their strengths to support safer and more reliable decisions.
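
The escalation rule described above (send a case to a clinician when the system is uncertain or the specialists disagree) can be illustrated with a minimal sketch. This is not HetMedAgent's implementation, which the summary does not specify. The names `AgentOpinion`, `Decision`, `route_case`, the averaging rule, and the fixed threshold of 0.7 are all hypothetical, and the framework's adaptive threshold calibration is not modeled here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class AgentOpinion:
    agent: str         # hypothetical agent name, e.g. "echo_specialist"
    label: str         # the class this agent predicts
    confidence: float  # self-reported confidence in [0, 1]


@dataclass
class Decision:
    label: str         # label with the most confidence-weighted support
    confidence: float  # winning support averaged over all agents
    route: str         # "auto" or "clinician_review"


def route_case(opinions, threshold=0.7):
    """Fuse agent opinions and decide whether a clinician should review.

    Assumed rule (illustrative only): escalate when agents disagree on the
    label, or when the fused confidence falls below a fixed threshold.
    """
    if not opinions:
        raise ValueError("at least one agent opinion is required")

    # Sum confidence per label; agents voting for another label add nothing.
    support = Counter()
    for op in opinions:
        support[op.label] += op.confidence

    label, weight = support.most_common(1)[0]
    fused = weight / len(opinions)

    conflict = len(support) > 1      # specialists disagree
    uncertain = fused < threshold    # system is not confident enough
    route = "clinician_review" if (conflict or uncertain) else "auto"
    return Decision(label=label, confidence=fused, route=route)


if __name__ == "__main__":
    case = [
        AgentOpinion("echo_specialist", "high_risk", 0.9),
        AgentOpinion("ecg_specialist", "low_risk", 0.8),
    ]
    print(route_case(case))  # the two specialists disagree, so this escalates
```

In this sketch a conflict always triggers review regardless of confidence, which errs on the side of clinician involvement; a real system would likely weigh conflicts by severity and calibrate the threshold per task.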