Unsupervised Model-based Pre-training for Data-efficient Reinforcement Learning from Pixels
Sai Rajeswar · Pietro Mazzaglia · Tim Verbelen · Alex Piche · Bart Dhoedt · Aaron Courville · Alexandre Lacoste
Event URL: https://openreview.net/forum?id=8ZuNl-FXGbB

Reinforcement learning aims at autonomously performing complex tasks. To this end, a reward signal is used to steer the learning process. While successful in many circumstances, the approach is typically data-hungry, requiring large amounts of task-specific interaction between agent and environment to learn efficient behaviors. To alleviate this, unsupervised reinforcement learning proposes to collect data through self-supervised interaction to accelerate task-specific adaptation. However, whether current unsupervised strategies lead to improved generalization capabilities is still unclear, especially when the input observations are high-dimensional. In this work, we advance the field by closing the performance gap in the Unsupervised Reinforcement Learning Benchmark, a collection of tasks to be solved in a data-efficient manner after interacting with the environment in a self-supervised way. Our model-based approach combines exploration and planning to efficiently fine-tune unsupervised pre-trained models, achieving results comparable to task-specific baselines. We extensively evaluate our work, comparing several exploration methods and improving fine-tuning by studying the interaction between the model components. Furthermore, we investigate the limits of the learned model and the unsupervised methods to gain insights into how these influence the decision process, shedding light on new research directions.

Author Information

Sai Rajeswar (University of Montreal)
Pietro Mazzaglia (Ghent University)
Tim Verbelen (Ghent University - imec)
Alex Piche (Mila)
Bart Dhoedt (Ghent University)
Aaron Courville (University of Montreal)
Alexandre Lacoste (Element AI)
