Poster in Affinity Event: The 6th Muslims in ML (MusIML) Workshop

AgentRx: A Benchmark for Multimodal Clinical Forecasting with LLM Agents

Baraa Al Jorf ⋅ Farah E. Shamout

Abstract

Effective clinical forecasting requires integrating heterogeneous multimodal data, including electronic health records, images, and clinical notes. While Large Language Model (LLM) agents offer a promising way to mitigate healthcare data fragmentation, their effectiveness in multimodal clinical risk forecasting remains largely unexplored. To address this gap, we introduce AgentRx, a systematic benchmark that evaluates single-agent and multi-agent LLM frameworks across unimodal and multimodal clinical prediction tasks using real-world data. Our findings show that single-agent frameworks outperform naive multi-agent systems, handle multimodal data better, and are better calibrated.