AgentRx: A Benchmark for Multimodal Clinical Forecasting with LLM Agents
Baraa Al Jorf ⋅ Farah E. Shamout
Abstract
Effective clinical forecasting requires integrating heterogeneous multimodal data, including electronic health records, images, and clinical notes. While Large Language Model (LLM) agents present a promising solution to mitigate healthcare data fragmentation, their effectiveness in multimodal clinical risk forecasting remains largely unexplored. To address this, we introduce AgentRx, a systematic benchmark evaluating single-agent and multi-agent LLM frameworks across unimodal and multimodal clinical prediction tasks using real-world data. Our findings highlight that single-agent frameworks outperform naive multi-agent systems, are better at handling multimodal data, and are better calibrated.