Structured Expert Routing with Multi-View Task Priors for Offline Meta-Reinforcement Learning
Abstract
Offline meta-reinforcement learning requires agents to generalize to unseen tasks from fixed datasets, yet existing sequence-based and MoE-based methods rely on implicit or token-level routing signals that fail to capture task-level structure. We propose the Task-Guided Router (TGR), a structured expert-routing framework that explicitly models inter-task relationships via multi-view task representations that combine semantic descriptors, behavioral summaries, and latent dynamics features. Using structure-guided routing, TGR assigns experts based on global task compatibility rather than local trajectory fragments, enabling stable specialization and effective knowledge transfer across tasks. Extensive experiments on continuous-control benchmarks demonstrate that TGR consistently outperforms state-of-the-art offline meta-RL methods in few-shot generalization, particularly under sparse data and heterogeneous dynamics. Our results highlight the importance of task-level priors for robust offline meta-reinforcement learning.