Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs, albeit using hand-crafted auxiliary tasks and pseudo-rewards. Automatically learning such representations in an object-centric manner geared towards control and fast adaptation remains an open research problem. In this paper, we introduce a method that discovers meaningful features from objects, translates them into temporally coherent `question' functions, and leverages the resulting learned general value functions for control. We compare our approach with state-of-the-art techniques alongside other ablations and show competitive performance in both stationary and non-stationary settings. Finally, we also investigate the discovered general value functions and, through qualitative analysis, show that the learned representations are not only interpretable but also centered around objects that are invariant to changes across tasks, facilitating fast adaptation.
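To make the abstract's core object concrete: a general value function (GVF) predicts the discounted sum of an arbitrary "cumulant" signal under a policy, where the cumulant and continuation functions together form the `question'. The sketch below is a minimal tabular TD(0) learner for one such question; the chain environment, fixed policy, and the specific cumulant/continuation choices are illustrative assumptions, not the paper's discovered questions.

```python
# Minimal sketch of learning one general value function (GVF) with TD(0).
# A GVF answers a "question" defined by a cumulant c(s) and a continuation
# function gamma(s): v(s) = E[ c(s1) + gamma(s1) * c(s2) + ... ].
# Environment: a small deterministic chain (illustrative assumption).

def td0_gvf(num_states=5, episodes=500, alpha=0.1):
    v = [0.0] * (num_states + 1)  # value estimates; index num_states is terminal

    # Question functions (assumed for illustration): the cumulant fires when
    # the agent reaches the terminal "object" state; continuation is 0 there.
    def cumulant(s):
        return 1.0 if s == num_states else 0.0

    def gamma(s):
        return 0.0 if s == num_states else 0.9

    for _ in range(episodes):
        s = 0
        while s != num_states:
            s_next = s + 1  # fixed policy: always move right
            # TD(0) update toward the one-step bootstrapped target
            target = cumulant(s_next) + gamma(s_next) * v[s_next]
            v[s] += alpha * (target - v[s])
            s = s_next
    return v

values = td0_gvf()
# Non-terminal values approach 0.9 ** (4 - i), e.g. values[4] -> 1.0
```

A control agent would learn many such GVFs in parallel (one per discovered question) and use their predictions as an object-centric feature vector for the policy.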
Author Information
Somjit Nath (Mila/ÉTS)
Gopeshh Subbaraj (Mila/UdeM)
Khimya Khetarpal (Google DeepMind)
Samira Ebrahimi Kahou (Microsoft Research)
More from the Same Authors
- 2022 : Latent Variable Models for Bayesian Causal Discovery »
  Jithendaraa Subramanian · Yashas Annadani · Ivaxi Sheth · Stefan Bauer · Derek Nowrouzezahrai · Samira Ebrahimi Kahou
- 2020 Poster: What can I do here? A Theory of Affordances in Reinforcement Learning »
  Khimya Khetarpal · Zafarali Ahmed · Gheorghe Comanici · David Abel · Doina Precup
- 2019 Workshop: Workshop on Multi-Task and Lifelong Reinforcement Learning »
  Sarath Chandar · Shagun Sodhani · Khimya Khetarpal · Tom Zahavy · Daniel J. Mankowitz · Shie Mannor · Balaraman Ravindran · Doina Precup · Chelsea Finn · Abhishek Gupta · Amy Zhang · Kyunghyun Cho · Andrei A Rusu · Rob Fergus