The Reflective Explorer: Online Meta-Exploration from Offline Data in Visual Tasks with Sparse Rewards
Rafael Rafailov · Varun Kumar · Tianhe (Kevin) Yu · Avi Singh · Mariano Phielipp · Chelsea Finn

Reinforcement learning is difficult to apply to real-world problems because of its high sample complexity, the need to adapt to the distribution shifts that regularly occur in the real world, and the complexity of learning from high-dimensional inputs such as images. Over the last several years, meta-learning has emerged as a promising approach to these problems by explicitly training an agent to adapt quickly to novel tasks. However, such methods still require huge amounts of data during training and are difficult to optimize in high-dimensional domains. One potential solution is offline (or batch) meta-learning: learning from existing datasets without additional environment interactions during training. In this work, we develop the first offline meta-learning algorithm that operates from images in tasks with sparse rewards. Our approach has three main components: a novel strategy for constructing meta-exploration trajectories from offline data, deep variational filter training, and latent offline model-free policy optimization. We show that our method completely solves a realistic meta-learning task involving robot manipulation, while naive combinations of meta-learning and offline algorithms significantly underperform.

Author Information

Rafael Rafailov (Stanford University)
Varun Kumar (Intel AI Lab)
Tianhe (Kevin) Yu (Stanford University)
Avi Singh (UC Berkeley)
Mariano Phielipp (Intel AI Labs)
Chelsea Finn (Stanford)

Chelsea Finn is an Assistant Professor in Computer Science and Electrical Engineering at Stanford University. Finn's research interests lie in the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction. To this end, her work has included deep learning algorithms for concurrently learning visual perception and control in robotic manipulation skills, inverse reinforcement methods for learning reward functions underlying behavior, and meta-learning algorithms that can enable fast, few-shot adaptation in both visual perception and deep reinforcement learning. Finn received her Bachelor's degree in Electrical Engineering and Computer Science at MIT and her PhD in Computer Science at UC Berkeley. Her research has been recognized through the ACM doctoral dissertation award, the Microsoft Research Faculty Fellowship, the C.V. Ramamoorthy Distinguished Research Award, and the MIT Technology Review 35 under 35 Award, and her work has been covered by various media outlets, including the New York Times, Wired, and Bloomberg. Throughout her career, she has sought to increase the representation of underrepresented minorities within CS and AI by developing an AI outreach camp at Berkeley for underprivileged high school students, a mentoring program for underrepresented undergraduates across four universities, and leading efforts within the WiML and Berkeley WiCSE communities of women researchers.
