TaskLoom: Weaving Knowledge Across Tasks in World Models
Abstract
Sample efficiency remains one of the central challenges in modern deep reinforcement learning (DRL). In recent years, world-model approaches have substantially improved the sample efficiency of model-based reinforcement learning (MBRL) by enabling policy learning in imagination, thereby reducing the need for direct interaction with the real environment. However, most existing world-model methods are trained independently for each task or perform multi-task learning only in offline settings, and thus fail to fully exploit the latent relationships among tasks in online interactive scenarios. To address this limitation, we propose TaskLoom, a knowledge-sharing world-model architecture for online reinforcement learning. TaskLoom adopts a grouped two-stage training paradigm: in the first stage, fine-grained knowledge is shared among tasks within each group; in the second stage, coarse-grained knowledge is exchanged across groups, enabling hierarchical knowledge transfer and reuse. Experimental results show that TaskLoom outperforms baseline methods on widely used benchmarks such as Proprio Control and Visual Control, validating the effectiveness of the proposed knowledge-sharing mechanism for both low-dimensional state and high-dimensional visual inputs.
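The grouped two-stage schedule described above can be sketched abstractly. The snippet below is a minimal illustration, not the paper's implementation: the function names (`intra_group_share`, `inter_group_share`, `two_stage_round`) and the use of simple parameter averaging as the "sharing" operator are assumptions made purely for exposition; the actual mechanism by which TaskLoom shares fine- and coarse-grained knowledge is defined in the paper, not here.

```python
# Illustrative sketch of a grouped two-stage sharing schedule.
# All names and the averaging operator are hypothetical stand-ins
# for whatever knowledge-sharing mechanism TaskLoom actually uses.
from statistics import mean
from typing import Dict, List


def intra_group_share(task_params: List[float]) -> float:
    """Stage 1 (fine-grained): pool task-level knowledge within one group."""
    return mean(task_params)


def inter_group_share(group_summaries: Dict[str, float]) -> float:
    """Stage 2 (coarse-grained): exchange group-level summaries across groups."""
    return mean(group_summaries.values())


def two_stage_round(groups: Dict[str, List[float]]) -> float:
    """One round: share within each group, then across groups."""
    summaries = {name: intra_group_share(params) for name, params in groups.items()}
    return inter_group_share(summaries)


# Example: two task groups, each holding two (toy) task-level parameters.
shared = two_stage_round({
    "locomotion": [0.25, 0.75],    # hypothetical group of tasks
    "manipulation": [0.5, 1.0],    # hypothetical group of tasks
})
# shared == 0.625
```

The point of the sketch is only the ordering: fine-grained pooling happens inside each group before any coarse-grained exchange crosses group boundaries, which is what makes the transfer hierarchical.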