Poster in Workshop: Workshop on Reinforcement Learning Theory
Multi-Task Offline Reinforcement Learning with Conservative Data Sharing
Tianhe (Kevin) Yu · Aviral Kumar · Yevgen Chebotar · Karol Hausman · Sergey Levine · Chelsea Finn
Offline reinforcement learning (RL) algorithms have shown promising results in domains where abundant pre-collected data is available. However, prior methods focus on solving individual problems from scratch with an offline dataset, without considering how an offline RL agent can acquire multiple skills. We argue that a natural use case of offline RL is in settings where we can pool large amounts of data collected in a number of different scenarios for solving various tasks, and use all of this data to learn strategies for all the tasks more effectively than training each one in isolation. To this end, we study the offline multi-task RL problem, with the goal of devising data-sharing strategies for effectively learning behaviors across all of the tasks. While it is possible to share all data across all tasks, we find that this simple strategy can actually exacerbate the distributional shift between the learned policy and the dataset, which in turn can lead to very poor performance. To address this challenge, we develop a simple technique for data sharing in multi-task offline RL that routes data based on the improvement over the task-specific data. We call this approach conservative data sharing (CDS), and it can be applied with any single-task offline RL method. On a range of challenging multi-task locomotion, navigation, and image-based robotic manipulation problems, CDS achieves the best or comparable performance relative to prior offline multi-task RL methods and previously proposed online multi-task data-sharing approaches.
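The idea of routing data "based on the improvement over the task-specific data" can be illustrated with a small sketch. The abstract does not state the exact routing rule, so everything below is an assumption made for illustration: the function name `conservative_share_mask`, the use of conservative Q-value estimates as the measure of improvement, and the percentile threshold are all hypothetical. The sketch keeps a transition from another task only if its estimated value under the target task is at least as high as a chosen percentile of the values on the target task's own data.

```python
import numpy as np


def conservative_share_mask(q_own, q_candidates, percentile=90.0):
    """Select which candidate transitions to share with a target task.

    Hypothetical illustration, not the authors' implementation.

    q_own:        conservative Q-value estimates for the target task,
                  evaluated on the target task's own transitions.
    q_candidates: the same Q-function evaluated on transitions taken
                  from other tasks (relabeled with the target reward).
    percentile:   assumed threshold; a candidate is shared only if its
                  value reaches this percentile of q_own.
    """
    threshold = np.percentile(q_own, percentile)
    return q_candidates >= threshold


# Toy values, invented purely for the example.
q_own = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
q_candidates = np.array([-1.0, 2.5, 5.0])

mask = conservative_share_mask(q_own, q_candidates, percentile=50.0)
shared = q_candidates[mask]  # candidates that pass the filter
```

Under these assumptions, sharing everything corresponds to setting the threshold so low that every candidate passes, and no sharing corresponds to a threshold no candidate can reach; a filter of this kind sits between those two extremes.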