SPR-RAFT: Parameter-Efficient Regression-Aware Fine-Tuning for Biomedical LLM Regression
Yuanlin Yang ⋅ Chenhui Li ⋅ Xuhao Guo ⋅ Anqi Zhang ⋅ Hoi Leong Lee ⋅ Haodong Liu
Abstract
Biomedical regression tasks require predicting continuous targets from heterogeneous, unstructured evidence. While Large Language Models (LLMs) provide a robust interface for reasoning over mixed modalities, they are inherently limited by discrete tokenization and cross-entropy objectives that lack any notion of numerical proximity. To bridge this gap, we present \textbf{SPR-RAFT}, a parameter-efficient, regression-aware framework that adapts frozen LLMs for high-precision regression. SPR-RAFT introduces a dual-module architecture: a learnable soft prompt that conditions the LLM to route numerical reasoning into a specific latent state, and a lightweight \texttt{[REG]}-anchored head that consolidates this reasoning into a continuous prediction. Crucially, we align the two modalities via a hybrid objective that combines distribution-based text generation with representation-based robust regression, ensuring the model remains both semantically coherent and numerically calibrated. With only $\sim$0.04\% of parameters trainable, SPR-RAFT consistently outperforms prompting strategies, standard fine-tuning, and non-LLM baselines across diverse biomedical benchmarks, including clinical trial duration, biological age estimation, and molecular property prediction.
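Since the abstract only sketches the architecture, the following is a minimal PyTorch illustration of how a frozen backbone, a learnable soft prompt, a \texttt{[REG]}-anchored head, and a hybrid objective could fit together. The prompt length, the Huber loss standing in for the "robust regression" term, and the weight `lambda_reg` are assumptions for illustration, not details from the paper.

```python
# Minimal sketch of the SPR-RAFT idea as described in the abstract.
# Assumptions (not from the paper): prompt length, head shape, Huber loss
# as the robust regression term, and the lambda_reg loss weight.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SPRRAFTSketch(nn.Module):
    def __init__(self, frozen_lm: nn.Module, d_model: int, n_prompt: int = 16):
        super().__init__()
        self.lm = frozen_lm
        for p in self.lm.parameters():  # backbone stays frozen
            p.requires_grad = False
        # Learnable soft prompt prepended to the input embeddings.
        self.soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)
        # Lightweight regression head read out at the [REG] position.
        self.reg_head = nn.Sequential(
            nn.Linear(d_model, d_model // 4),
            nn.GELU(),
            nn.Linear(d_model // 4, 1),
        )

    def forward(self, input_embeds: torch.Tensor, reg_pos: int) -> torch.Tensor:
        # input_embeds: (B, T, d_model); reg_pos: index of the [REG] token.
        B = input_embeds.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(B, -1, -1)
        hidden = self.lm(torch.cat([prompt, input_embeds], dim=1))
        # Offset by the prompt length to recover the [REG] hidden state.
        reg_hidden = hidden[:, self.soft_prompt.size(0) + reg_pos, :]
        return self.reg_head(reg_hidden).squeeze(-1)


def hybrid_loss(lm_logits, target_ids, y_pred, y_true, lambda_reg=1.0):
    # Distribution-based text-generation term plus a representation-based
    # robust regression term, combined as a weighted sum.
    ce = F.cross_entropy(lm_logits.flatten(0, 1), target_ids.flatten())
    robust = F.huber_loss(y_pred, y_true)  # assumed robust loss
    return ce + lambda_reg * robust


if __name__ == "__main__":
    # Toy frozen "LM": a small transformer encoder over embeddings.
    lm = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
        num_layers=2,
    )
    model = SPRRAFTSketch(lm, d_model=64)
    x = torch.randn(2, 10, 64)       # toy input embeddings
    y_pred = model(x, reg_pos=9)     # [REG] assumed at the last position
    print(y_pred.shape)              # torch.Size([2])
```

In a full setup the cross-entropy term would use the logits from the LLM's own output head; the toy encoder above omits that head, so `hybrid_loss` is shown separately. The key design point the abstract emphasizes is that only the soft prompt and the small head receive gradients, which is what keeps the trainable-parameter fraction tiny.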