Timezone: »
We describe a framework for multitask deep reinforcement learning guided by policy sketches. Sketches annotate tasks with sequences of named subtasks, providing information about high-level structural relationships among tasks but not how to implement them---specifically not providing the detailed guidance used by much previous work on learning policy abstractions for RL (e.g. intermediate rewards, subtask completion signals, or intrinsic motivations). To learn from sketches, we present a model that associates every subtask with a modular subpolicy, and jointly maximizes reward over full task-specific policies by tying parameters across shared subpolicies. Optimization is accomplished via a decoupled actor--critic training objective that facilitates learning common behaviors from multiple dissimilar reward functions. We evaluate the effectiveness of our approach in three environments featuring both discrete and continuous control, and with sparse rewards that can be obtained only after completing a number of high-level subgoals. Experiments show that using our approach to learn policies guided by sketches gives better performance than existing techniques for learning task-specific or shared policies, while naturally inducing a library of interpretable primitive behaviors that can be recombined to rapidly adapt to new tasks.
Author Information
Jacob Andreas (UC Berkeley)
Dan Klein (UC Berkeley)
Sergey Levine (Berkeley)

Sergey Levine received a BS and MS in Computer Science from Stanford University in 2009, and a Ph.D. in Computer Science from Stanford University in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as computer vision and graphics. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.
Related Events (a corresponding poster, oral, or spotlight)
-
2017 Poster: Modular Multitask Reinforcement Learning with Policy Sketches »
Tue. Aug 8th 08:30 AM -- 12:00 PM Room Gallery #27
More from the Same Authors
-
2021 : Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability »
Dibya Ghosh · Jad Rahme · Aviral Kumar · Amy Zhang · Ryan P. Adams · Sergey Levine -
2021 : Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Aaron Courville · Tengyu Ma · George Tucker · Sergey Levine -
2021 : Multi-Task Offline Reinforcement Learning with Conservative Data Sharing »
Tianhe (Kevin) Yu · Aviral Kumar · Yevgen Chebotar · Karol Hausman · Sergey Levine · Chelsea Finn -
2021 : Reinforcement Learning as One Big Sequence Modeling Problem »
Michael Janner · Qiyang Li · Sergey Levine -
2021 : Learning Space Partitions for Path Planning »
Kevin Yang · Tianjun Zhang · Chris Cummins · Brandon Cui · Benoit Steiner · Linnan Wang · Joseph E Gonzalez · Dan Klein · Yuandong Tian -
2021 : ReLMM: Practical RL for Learning Mobile Manipulation Skills Using Only Onboard Sensors »
Charles Sun · Jedrzej Orbik · Coline Devin · Abhishek Gupta · Glen Berseth · Sergey Levine -
2021 : Multi-Task Offline Reinforcement Learning with Conservative Data Sharing »
Tianhe (Kevin) Yu · Aviral Kumar · Yevgen Chebotar · Karol Hausman · Sergey Levine · Chelsea Finn -
2021 : Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Aaron Courville · Tengyu Ma · George Tucker · Sergey Levine -
2021 : Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Aaron Courville · Tengyu Ma · George Tucker · Sergey Levine -
2022 : Distributionally Adaptive Meta Reinforcement Learning »
Anurag Ajay · Dibya Ghosh · Sergey Levine · Pulkit Agrawal · Abhishek Gupta -
2023 Poster: Poisoning Language Models During Instruction Tuning »
Alexander Wan · Eric Wallace · Sheng Shen · Dan Klein -
2023 Poster: Guiding Pretraining in Reinforcement Learning with Large Language Models »
Yuqing Du · Olivia Watkins · Zihan Wang · Cédric Colas · Trevor Darrell · Pieter Abbeel · Abhishek Gupta · Jacob Andreas -
2023 Poster: PromptBoosting: Black-Box Text Classification with Ten Forward Passes »
Bairu Hou · Joe O'Connor · Jacob Andreas · Shiyu Chang · Yang Zhang -
2022 : Q/A Sergey Levine »
Sergey Levine -
2022 : Invited Speaker: Sergey Levine »
Sergey Levine -
2022 Poster: Offline Meta-Reinforcement Learning with Online Self-Supervision »
Vitchyr Pong · Ashvin Nair · Laura Smith · Catherine Huang · Sergey Levine -
2022 Poster: Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization »
Brandon Trabucco · Xinyang Geng · Aviral Kumar · Sergey Levine -
2022 Poster: How to Leverage Unlabeled Data in Offline Reinforcement Learning »
Tianhe (Kevin) Yu · Aviral Kumar · Yevgen Chebotar · Karol Hausman · Chelsea Finn · Sergey Levine -
2022 Spotlight: How to Leverage Unlabeled Data in Offline Reinforcement Learning »
Tianhe (Kevin) Yu · Aviral Kumar · Yevgen Chebotar · Karol Hausman · Chelsea Finn · Sergey Levine -
2022 Spotlight: Offline Meta-Reinforcement Learning with Online Self-Supervision »
Vitchyr Pong · Ashvin Nair · Laura Smith · Catherine Huang · Sergey Levine -
2022 Spotlight: Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization »
Brandon Trabucco · Xinyang Geng · Aviral Kumar · Sergey Levine -
2022 Poster: Modeling Strong and Human-Like Gameplay with KL-Regularized Search »
Athul Paul Jacob · David Wu · Gabriele Farina · Adam Lerer · Hengyuan Hu · Anton Bakhtin · Jacob Andreas · Noam Brown -
2022 Poster: Planning with Diffusion for Flexible Behavior Synthesis »
Michael Janner · Yilun Du · Josh Tenenbaum · Sergey Levine -
2022 Oral: Planning with Diffusion for Flexible Behavior Synthesis »
Michael Janner · Yilun Du · Josh Tenenbaum · Sergey Levine -
2022 Spotlight: Modeling Strong and Human-Like Gameplay with KL-Regularized Search »
Athul Paul Jacob · David Wu · Gabriele Farina · Adam Lerer · Hengyuan Hu · Anton Bakhtin · Jacob Andreas · Noam Brown -
2022 Poster: Describing Differences between Text Distributions with Natural Language »
Ruiqi Zhong · Charlie Snell · Dan Klein · Jacob Steinhardt -
2022 Poster: Offline RL Policies Should Be Trained to be Adaptive »
Dibya Ghosh · Anurag Ajay · Pulkit Agrawal · Sergey Levine -
2022 Spotlight: Describing Differences between Text Distributions with Natural Language »
Ruiqi Zhong · Charlie Snell · Dan Klein · Jacob Steinhardt -
2022 Oral: Offline RL Policies Should Be Trained to be Adaptive »
Dibya Ghosh · Anurag Ajay · Pulkit Agrawal · Sergey Levine -
2021 : Value-Based Deep Reinforcement Learning Requires Explicit Regularization »
Aviral Kumar · Rishabh Agarwal · Aaron Courville · Tengyu Ma · George Tucker · Sergey Levine -
2021 Poster: Simple and Effective VAE Training with Calibrated Decoders »
Oleh Rybkin · Kostas Daniilidis · Sergey Levine -
2021 Poster: WILDS: A Benchmark of in-the-Wild Distribution Shifts »
Pang Wei Koh · Shiori Sagawa · Henrik Marklund · Sang Michael Xie · Marvin Zhang · Akshay Balsubramani · Weihua Hu · Michihiro Yasunaga · Richard Lanas Phillips · Irena Gao · Tony Lee · Etienne David · Ian Stavness · Wei Guo · Berton Earnshaw · Imran Haque · Sara Beery · Jure Leskovec · Anshul Kundaje · Emma Pierson · Sergey Levine · Chelsea Finn · Percy Liang -
2021 Poster: Calibrate Before Use: Improving Few-shot Performance of Language Models »
Tony Z. Zhao · Eric Wallace · Shi Feng · Dan Klein · Sameer Singh -
2021 Oral: WILDS: A Benchmark of in-the-Wild Distribution Shifts »
Pang Wei Koh · Shiori Sagawa · Henrik Marklund · Sang Michael Xie · Marvin Zhang · Akshay Balsubramani · Weihua Hu · Michihiro Yasunaga · Richard Lanas Phillips · Irena Gao · Tony Lee · Etienne David · Ian Stavness · Wei Guo · Berton Earnshaw · Imran Haque · Sara Beery · Jure Leskovec · Anshul Kundaje · Emma Pierson · Sergey Levine · Chelsea Finn · Percy Liang -
2021 Oral: Calibrate Before Use: Improving Few-shot Performance of Language Models »
Tony Z. Zhao · Eric Wallace · Shi Feng · Dan Klein · Sameer Singh -
2021 Spotlight: Simple and Effective VAE Training with Calibrated Decoders »
Oleh Rybkin · Kostas Daniilidis · Sergey Levine -
2021 Poster: Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment »
Michael Chang · Sid Kaushik · Sergey Levine · Thomas Griffiths -
2021 Poster: Conservative Objective Models for Effective Offline Model-Based Optimization »
Brandon Trabucco · Aviral Kumar · Xinyang Geng · Sergey Levine -
2021 Spotlight: Conservative Objective Models for Effective Offline Model-Based Optimization »
Brandon Trabucco · Aviral Kumar · Xinyang Geng · Sergey Levine -
2021 Oral: Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment »
Michael Chang · Sid Kaushik · Sergey Levine · Thomas Griffiths -
2021 Poster: Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning »
Hiroki Furuta · Tatsuya Matsushima · Tadashi Kozuno · Yutaka Matsuo · Sergey Levine · Ofir Nachum · Shixiang Gu -
2021 Poster: MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning »
Kevin Li · Abhishek Gupta · Ashwin D Reddy · Vitchyr Pong · Aurick Zhou · Justin Yu · Sergey Levine -
2021 Poster: PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning »
Angelos Filos · Clare Lyle · Yarin Gal · Sergey Levine · Natasha Jaques · Gregory Farquhar -
2021 Poster: Leveraging Language to Learn Program Abstractions and Search Heuristics »
Catherine Wong · Kevin Ellis · Josh Tenenbaum · Jacob Andreas -
2021 Spotlight: MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning »
Kevin Li · Abhishek Gupta · Ashwin D Reddy · Vitchyr Pong · Aurick Zhou · Justin Yu · Sergey Levine -
2021 Spotlight: Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning »
Hiroki Furuta · Tatsuya Matsushima · Tadashi Kozuno · Yutaka Matsuo · Sergey Levine · Ofir Nachum · Shixiang Gu -
2021 Spotlight: Leveraging Language to Learn Program Abstractions and Search Heuristics »
Catherine Wong · Kevin Ellis · Josh Tenenbaum · Jacob Andreas -
2021 Oral: PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning »
Angelos Filos · Clare Lyle · Yarin Gal · Sergey Levine · Natasha Jaques · Gregory Farquhar -
2021 Poster: Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation »
Aurick Zhou · Sergey Levine -
2021 Poster: Model-Based Reinforcement Learning via Latent-Space Collocation »
Oleh Rybkin · Chuning Zhu · Anusha Nagabandi · Kostas Daniilidis · Igor Mordatch · Sergey Levine -
2021 Spotlight: Model-Based Reinforcement Learning via Latent-Space Collocation »
Oleh Rybkin · Chuning Zhu · Anusha Nagabandi · Kostas Daniilidis · Igor Mordatch · Sergey Levine -
2021 Spotlight: Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation »
Aurick Zhou · Sergey Levine -
2020 : Invited Talk 9: Prof. Sergey Levine from UC Berkeley »
Sergey Levine -
2020 Workshop: 1st Workshop on Language in Reinforcement Learning (LaReL) »
Nantas Nardelli · Jelena Luketina · Nantas Nardelli · Jakob Foerster · Victor Zhong · Jacob Andreas · Tim Rocktäschel · Edward Grefenstette · Tim Rocktäschel -
2020 Poster: Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions »
Michael Chang · Sid Kaushik · S. Matthew Weinberg · Thomas Griffiths · Sergey Levine -
2020 Poster: Learning Human Objectives by Evaluating Hypothetical Behavior »
Siddharth Reddy · Anca Dragan · Sergey Levine · Shane Legg · Jan Leike -
2020 Poster: Skew-Fit: State-Covering Self-Supervised Reinforcement Learning »
Vitchyr Pong · Murtaza Dalal · Steven Lin · Ashvin Nair · Shikhar Bahl · Sergey Levine -
2020 Poster: Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? »
Angelos Filos · Panagiotis Tigas · Rowan McAllister · Nicholas Rhinehart · Sergey Levine · Yarin Gal -
2020 Poster: Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings »
Jesse Zhang · Brian Cheung · Chelsea Finn · Sergey Levine · Dinesh Jayaraman -
2020 Poster: Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers »
Zhuohan Li · Eric Wallace · Sheng Shen · Kevin Lin · Kurt Keutzer · Dan Klein · Joseph Gonzalez -
2019 : Jacob Andreas: Linguistic Scaffolds for Policy Learning »
Jacob Andreas -
2019 : Sergey Levine: "Imitation, Prediction, and Model-Based Reinforcement Learning for Autonomous Driving" »
Sergey Levine -
2019 : Sergey Levine: Unsupervised Reinforcement Learning and Meta-Learning »
Sergey Levine -
2019 Workshop: Adaptive and Multitask Learning: Algorithms & Systems »
Maruan Al-Shedivat · Anthony Platanios · Otilia Stretcu · Jacob Andreas · Ameet Talwalkar · Rich Caruana · Tom Mitchell · Eric Xing -
2019 Workshop: ICML Workshop on Imitation, Intent, and Interaction (I3) »
Nicholas Rhinehart · Sergey Levine · Chelsea Finn · He He · Ilya Kostrikov · Justin Fu · Siddharth Reddy -
2019 : Sergei Levine: Distribution Matching and Mutual Information in Reinforcement Learning »
Sergey Levine -
2019 Workshop: Generative Modeling and Model-Based Reasoning for Robotics and AI »
Aravind Rajeswaran · Emanuel Todorov · Igor Mordatch · William Agnew · Amy Zhang · Joelle Pineau · Michael Chang · Dumitru Erhan · Sergey Levine · Kimberly Stachenfeld · Marvin Zhang -
2019 Poster: Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables »
Kate Rakelly · Aurick Zhou · Chelsea Finn · Sergey Levine · Deirdre Quillen -
2019 Poster: SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning »
Marvin Zhang · Sharad Vikram · Laura Smith · Pieter Abbeel · Matthew Johnson · Sergey Levine -
2019 Oral: Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables »
Kate Rakelly · Aurick Zhou · Chelsea Finn · Sergey Levine · Deirdre Quillen -
2019 Oral: SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning »
Marvin Zhang · Sharad Vikram · Laura Smith · Pieter Abbeel · Matthew Johnson · Sergey Levine -
2019 Poster: Learning a Prior over Intent via Meta-Inverse Reinforcement Learning »
Kelvin Xu · Ellis Ratner · Anca Dragan · Sergey Levine · Chelsea Finn -
2019 Poster: EMI: Exploration with Mutual Information »
Hyoungseok Kim · Jaekyeom Kim · Yeonwoo Jeong · Sergey Levine · Hyun Oh Song -
2019 Poster: Online Meta-Learning »
Chelsea Finn · Aravind Rajeswaran · Sham Kakade · Sergey Levine -
2019 Poster: Diagnosing Bottlenecks in Deep Q-learning Algorithms »
Justin Fu · Aviral Kumar · Matthew Soh · Sergey Levine -
2019 Oral: Learning a Prior over Intent via Meta-Inverse Reinforcement Learning »
Kelvin Xu · Ellis Ratner · Anca Dragan · Sergey Levine · Chelsea Finn -
2019 Oral: EMI: Exploration with Mutual Information »
Hyoungseok Kim · Jaekyeom Kim · Yeonwoo Jeong · Sergey Levine · Hyun Oh Song -
2019 Oral: Diagnosing Bottlenecks in Deep Q-learning Algorithms »
Justin Fu · Aviral Kumar · Matthew Soh · Sergey Levine -
2019 Oral: Online Meta-Learning »
Chelsea Finn · Aravind Rajeswaran · Sham Kakade · Sergey Levine -
2019 Tutorial: Meta-Learning: from Few-Shot Learning to Rapid Reinforcement Learning »
Chelsea Finn · Sergey Levine -
2018 Poster: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor »
Tuomas Haarnoja · Aurick Zhou · Pieter Abbeel · Sergey Levine -
2018 Poster: Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? »
Maithra Raghu · Alexander Irpan · Jacob Andreas · Bobby Kleinberg · Quoc Le · Jon Kleinberg -
2018 Poster: Regret Minimization for Partially Observable Deep Reinforcement Learning »
Peter Jin · EECS Kurt Keutzer · Sergey Levine -
2018 Poster: The Mirage of Action-Dependent Baselines in Reinforcement Learning »
George Tucker · Surya Bhupatiraju · Shixiang Gu · Richard E Turner · Zoubin Ghahramani · Sergey Levine -
2018 Oral: Regret Minimization for Partially Observable Deep Reinforcement Learning »
Peter Jin · EECS Kurt Keutzer · Sergey Levine -
2018 Oral: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor »
Tuomas Haarnoja · Aurick Zhou · Pieter Abbeel · Sergey Levine -
2018 Oral: The Mirage of Action-Dependent Baselines in Reinforcement Learning »
George Tucker · Surya Bhupatiraju · Shixiang Gu · Richard E Turner · Zoubin Ghahramani · Sergey Levine -
2018 Oral: Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? »
Maithra Raghu · Alexander Irpan · Jacob Andreas · Bobby Kleinberg · Quoc Le · Jon Kleinberg -
2018 Poster: Latent Space Policies for Hierarchical Reinforcement Learning »
Tuomas Haarnoja · Kristian Hartikainen · Pieter Abbeel · Sergey Levine -
2018 Poster: Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings »
John Co-Reyes · Yu Xuan Liu · Abhishek Gupta · Benjamin Eysenbach · Pieter Abbeel · Sergey Levine -
2018 Poster: Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control »
Aravind Srinivas · Allan Jabri · Pieter Abbeel · Sergey Levine · Chelsea Finn -
2018 Oral: Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control »
Aravind Srinivas · Allan Jabri · Pieter Abbeel · Sergey Levine · Chelsea Finn -
2018 Oral: Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings »
John Co-Reyes · Yu Xuan Liu · Abhishek Gupta · Benjamin Eysenbach · Pieter Abbeel · Sergey Levine -
2018 Oral: Latent Space Policies for Hierarchical Reinforcement Learning »
Tuomas Haarnoja · Kristian Hartikainen · Pieter Abbeel · Sergey Levine -
2017 : Lifelong Learning - Panel Discussion »
Sergey Levine · Joelle Pineau · Balaraman Ravindran · Andrei A Rusu -
2017 : Sergey Levine: Self-supervision as a path to lifelong learning »
Sergey Levine -
2017 Poster: Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning »
Yevgen Chebotar · Karol Hausman · Marvin Zhang · Gaurav Sukhatme · Stefan Schaal · Sergey Levine -
2017 Talk: Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning »
Yevgen Chebotar · Karol Hausman · Marvin Zhang · Gaurav Sukhatme · Stefan Schaal · Sergey Levine -
2017 Poster: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks »
Chelsea Finn · Pieter Abbeel · Sergey Levine -
2017 Poster: Reinforcement Learning with Deep Energy-Based Policies »
Tuomas Haarnoja · Haoran Tang · Pieter Abbeel · Sergey Levine -
2017 Talk: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks »
Chelsea Finn · Pieter Abbeel · Sergey Levine -
2017 Talk: Reinforcement Learning with Deep Energy-Based Policies »
Tuomas Haarnoja · Haoran Tang · Pieter Abbeel · Sergey Levine -
2017 Tutorial: Deep Reinforcement Learning, Decision Making, and Control »
Sergey Levine · Chelsea Finn