ICR-RL: Deep Reinforcement Learning via In-Context Regression
Abstract
Recent advances in machine learning have largely been driven by foundation models (FMs) trained on large, diverse datasets, enabling them to generalize effectively to new, related tasks. However, extending this paradigm to reinforcement learning (RL), where an agent learns to select actions by interacting with an environment, remains a significant challenge. Most existing approaches train FMs directly on sets of control tasks, but developing diverse RL environments and scaling training across them can be costly and complex. In this study, we explore a simpler alternative based on a classical reduction from RL to regression. We demonstrate that a foundation model pre-trained on regression tasks, used as an in-context regression (ICR) model, can be applied directly to RL problems. Building on this insight, we introduce ICR-RL, a gradient-free method that requires no additional training and leverages an ICR foundation model to tackle RL tasks. We evaluate our approach by instantiating the ICR model with the recently proposed TabPFN, which is trained on a wide range of regression tasks. Experiments on the Gymnasium classic-control benchmark indicate that ICR-RL matches or outperforms strong baselines, including DQN and PPO. These results show that ICR foundation models can solve RL tasks effectively without fine-tuning, demonstrating their potential as a basis for RL-oriented foundation models.
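To make the idea concrete, the sketch below (not the authors' code) shows one plausible instantiation of the abstract's "classical reduction from RL to regression": fitted Q-iteration, where each Bellman backup is posed as a regression problem and solved in context by TabPFN. The environment choice (CartPole-v1), dataset size, iteration count, and helper names are illustrative assumptions; it assumes the tabpfn package with regression support and Gymnasium's five-tuple step API.

# Minimal sketch of the ICR-RL idea, assuming a fitted-Q-iteration-style
# reduction from RL to regression (illustrative, not the authors' code).
import numpy as np
import gymnasium as gym
from tabpfn import TabPFNRegressor  # assumes tabpfn >= 2.0 (regression support)

env = gym.make("CartPole-v1")
n_actions = env.action_space.n
gamma = 0.99

# 1) Collect a small dataset of transitions with a random behavior policy.
transitions = []
obs, _ = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()
    next_obs, reward, terminated, truncated, _ = env.step(action)
    transitions.append((obs, action, reward, next_obs, terminated))
    if terminated or truncated:
        obs, _ = env.reset()
    else:
        obs = next_obs

S = np.array([t[0] for t in transitions])
A = np.array([t[1] for t in transitions])
R = np.array([t[2] for t in transitions])
S2 = np.array([t[3] for t in transitions])
done = np.array([t[4] for t in transitions], dtype=float)
X = np.column_stack([S, A])  # regression inputs: (state, action) pairs

# 2) Fitted Q-iteration: each round poses one regression problem, which
#    TabPFN solves in context; the model's weights are never updated.
model = None
y = R.copy()  # first-round targets: Q_0(s, a) = r
for _ in range(5):
    model = TabPFNRegressor()
    model.fit(X, y)  # "fit" only conditions the ICR model on the context set
    # Bellman backup: r + gamma * max_a' Q(s', a') on non-terminal steps
    q_next = np.column_stack([
        model.predict(np.column_stack([S2, np.full(len(S2), a)]))
        for a in range(n_actions)
    ])
    y = R + gamma * (1.0 - done) * q_next.max(axis=1)

# 3) Act greedily with respect to the final in-context Q-estimate.
def greedy_action(observation):
    q_values = [
        model.predict(np.append(observation, a)[None, :])[0]
        for a in range(n_actions)
    ]
    return int(np.argmax(q_values))

Note that "fitting" TabPFN here only conditions the pre-trained model on a context set of (state, action) -> target pairs, so the whole procedure involves no gradient updates, consistent with the gradient-free, no-additional-training claim in the abstract.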