In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether. In such cases, curiosity can serve as an intrinsic reward signal to enable the agent to explore its environment and learn skills that might be useful later in its life. We formulate curiosity as the error in an agent's ability to predict the consequence of its own actions in a visual feature space learned by a self-supervised inverse dynamics model. Our formulation scales to high-dimensional continuous state spaces like images, bypasses the difficulties of directly predicting pixels, and, critically, ignores the aspects of the environment that cannot affect the agent. The proposed approach is evaluated in two environments: VizDoom and Super Mario Bros. Three broad settings are investigated: 1) sparse extrinsic reward, where curiosity allows for far fewer interactions with the environment to reach the goal; 2) exploration with no extrinsic reward, where curiosity pushes the agent to explore more efficiently; and 3) generalization to unseen scenarios (e.g. new levels of the same game) where the knowledge gained from earlier experience helps the agent explore new places much faster than starting from scratch.
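The curiosity signal described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `curiosity_reward`, `toy_encoder`, and `toy_forward_model` are hypothetical, the scaling factor `eta` and the squared-error form are assumed choices, and the hand-written encoder stands in for the feature space that the abstract says is learned by a self-supervised inverse dynamics model.

```python
from typing import List, Sequence


def curiosity_reward(predicted: Sequence[float],
                     actual: Sequence[float],
                     eta: float = 1.0) -> float:
    """Intrinsic reward: scaled squared prediction error in feature space.

    `eta` is an assumed scaling hyperparameter, not a value from the source.
    """
    if len(predicted) != len(actual):
        raise ValueError("feature vectors must have the same length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return 0.5 * eta * squared_error


def toy_encoder(state: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the learned feature encoder.

    It keeps the first two coordinates and drops the third, which plays
    the role of environment noise that the agent cannot affect.
    """
    return list(state[:2])


def toy_forward_model(features: Sequence[float], action: float) -> List[float]:
    """Hypothetical forward model: the action shifts the first feature."""
    return [features[0] + action, features[1]]


state = [0.0, 1.0, 0.37]            # third coordinate: uncontrollable noise
action = 1.0
predicted = toy_forward_model(toy_encoder(state), action)

expected_next = [1.0, 1.0, 0.91]    # noise changed, controllable part as predicted
surprising_next = [3.0, 1.0, 0.12]  # controllable part differs from the prediction

reward_expected = curiosity_reward(predicted, toy_encoder(expected_next))
reward_surprising = curiosity_reward(predicted, toy_encoder(surprising_next))
```

In this toy setup the change in the noise coordinate produces no reward because the encoder discards it, while the mispredicted controllable coordinate does. That mirrors the abstract's point that working in a learned feature space lets the agent ignore aspects of the environment it cannot affect.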
Author Information
Deepak Pathak (UC Berkeley)
Pulkit Agrawal (MIT)
Alexei Efros (UC Berkeley)
Trevor Darrell (University of California at Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Poster: Curiosity-driven Exploration by Self-supervised Prediction
  Mon. Aug 7th 08:30 AM -- 12:00 PM, Room Gallery #88
More from the Same Authors
- 2021: Explaining Reinforcement Learning Policies through Counterfactual Trajectories
  Julius Frost · Olivia Watkins · Eric Weiner · Pieter Abbeel · Trevor Darrell · Bryan Plummer · Kate Saenko
- 2023: Internet Explorer: Targeted Representation Learning on the Open Web
  Alexander Li · Ellis Brown · Alexei Efros · Deepak Pathak
- 2023: LLM-grounded Text-to-Image Diffusion Models
  Long (Tony) Lian · Boyi Li · Adam Yala · Trevor Darrell
- 2023 Poster: Internet Explorer: Targeted Representation Learning on the Open Web
  Alexander Li · Ellis Brown · Alexei Efros · Deepak Pathak
- 2022: Panel discussion
  Steffen Schneider · Aleksander Madry · Alexei Efros · Chelsea Finn · Soheil Feizi
- 2022: Back to the Source: Test-Time Diffusion-Driven Adaptation
  Jin Gao · Jialing Zhang · Xihui Liu · Trevor Darrell · Evan Shelhamer · Dequan Wang
- 2022: Invited Talk 4: Alexei Efros
  Alexei Efros
- 2022 Poster: Visual Attention Emerges from Recurrent Sparse Reconstruction
  Baifeng Shi · Yale Song · Neel Joshi · Trevor Darrell · Xin Wang
- 2022 Spotlight: Visual Attention Emerges from Recurrent Sparse Reconstruction
  Baifeng Shi · Yale Song · Neel Joshi · Trevor Darrell · Xin Wang
- 2022 Poster: Zero-Shot Reward Specification via Grounded Natural Language
  Parsa Mahmoudieh · Deepak Pathak · Trevor Darrell
- 2022 Spotlight: Zero-Shot Reward Specification via Grounded Natural Language
  Parsa Mahmoudieh · Deepak Pathak · Trevor Darrell
- 2021 Workshop: ICML Workshop on Human in the Loop Learning (HILL)
  Trevor Darrell · Xin Wang · Li Erran Li · Fisher Yu · Zeynep Akata · Wenwu Zhu · Pradeep Ravikumar · Shiji Zhou · Shanghang Zhang · Kalesha Bullard
- 2021 Poster: Compositional Video Synthesis with Action Graphs
  Amir Bar · Roi Herzig · Xiaolong Wang · Anna Rohrbach · Gal Chechik · Trevor Darrell · Amir Globerson
- 2021 Spotlight: Compositional Video Synthesis with Action Graphs
  Amir Bar · Roi Herzig · Xiaolong Wang · Anna Rohrbach · Gal Chechik · Trevor Darrell · Amir Globerson
- 2020 Workshop: 2nd ICML Workshop on Human in the Loop Learning (HILL)
  Shanghang Zhang · Xin Wang · Fisher Yu · Jiajun Wu · Trevor Darrell
- 2020: Live Invited Talk: Alexei Efros "Imagining a Post-Dataset Era"
  Alexei Efros
- 2020 Poster: Video Prediction via Example Guidance
  Jingwei Xu · Harry (Huazhe) Xu · Bingbing Ni · Xiaokang Yang · Trevor Darrell
- 2020 Poster: Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
  Yu Sun · Xiaolong Wang · Zhuang Liu · John Miller · Alexei Efros · Moritz Hardt
- 2020 Poster: Frustratingly Simple Few-Shot Object Detection
  Xin Wang · Thomas Huang · Joseph E Gonzalez · Trevor Darrell · Fisher Yu
- 2019: Fisher Yu: "Motion and Prediction for Autonomous Driving"
  Fisher Yu · Trevor Darrell
- 2019: Invited Talk by Professor Alexei Efros (UC Berkeley)
  Alexei Efros
- 2019 Poster: Self-Supervised Exploration via Disagreement
  Deepak Pathak · Dhiraj Gandhi · Abhinav Gupta
- 2019 Oral: Self-Supervised Exploration via Disagreement
  Deepak Pathak · Dhiraj Gandhi · Abhinav Gupta
- 2018 Poster: CyCADA: Cycle-Consistent Adversarial Domain Adaptation
  Judy Hoffman · Eric Tzeng · Taesung Park · Jun-Yan Zhu · Philip Isola · Kate Saenko · Alexei Efros · Trevor Darrell
- 2018 Oral: CyCADA: Cycle-Consistent Adversarial Domain Adaptation
  Judy Hoffman · Eric Tzeng · Taesung Park · Jun-Yan Zhu · Philip Isola · Kate Saenko · Alexei Efros · Trevor Darrell
- 2018 Poster: Investigating Human Priors for Playing Video Games
  Rachit Dubey · Pulkit Agrawal · Deepak Pathak · Tom Griffiths · Alexei Efros
- 2018 Oral: Investigating Human Priors for Playing Video Games
  Rachit Dubey · Pulkit Agrawal · Deepak Pathak · Tom Griffiths · Alexei Efros