Poster
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
David Venuto · Mengjiao Yang · Pieter Abbeel · Doina Precup · Igor Mordatch · Ofir Nachum
Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications. In reinforcement learning, however, a key challenge is that available data of sequential decision making is often not annotated with actions; for example, videos of game-play are much more available than sequences of frames paired with their logged game controls. We propose to circumvent this challenge by combining large but sparsely-annotated datasets from a *target* environment of interest with fully-annotated datasets from various other *source* environments. Our method, Action Limited PreTraining (ALPT), leverages the generalization capabilities of inverse dynamics modelling (IDM) to label missing action data in the target environment. We show that utilizing even one additional environment dataset of labelled data during IDM pretraining gives rise to substantial improvements in generating action labels for unannotated sequences. We evaluate our method on benchmark game-playing environments and show that we can significantly improve game performance and generalization capability compared to other approaches, using annotated datasets equivalent to only 12 minutes of gameplay. Highlighting the power of IDM, we show that these benefits remain even when target and source environments share no common actions.
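The idea of labelling missing actions with an inverse dynamics model can be illustrated with a toy sketch. The code below is not the authors' ALPT implementation: it assumes a hypothetical 1-D environment, a tabular IDM keyed on the direction of state change, and made-up helper names (`collect`, `fit_idm`, `pseudo_label`). It only shows the general pattern of fitting an IDM on a fully annotated source dataset plus a small annotated target slice, then using it to label the remaining target transitions.

```python
import random
from collections import Counter, defaultdict


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def collect(num_steps, stride, rng):
    """Roll out a random policy in a toy 1-D environment (an assumption for illustration).

    Actions: 0 = left, 1 = stay, 2 = right; each move shifts the state by `stride`.
    Returns a list of (state, action, next_state) triples.
    """
    state, data = 0, []
    for _ in range(num_steps):
        action = rng.randrange(3)
        next_state = state + (action - 1) * stride
        data.append((state, action, next_state))
        state = next_state
    return data


def fit_idm(labelled):
    """Tabular inverse dynamics model: most frequent action per direction of change."""
    counts = defaultdict(Counter)
    for state, action, next_state in labelled:
        counts[sign(next_state - state)][action] += 1
    return {direction: c.most_common(1)[0][0] for direction, c in counts.items()}


def pseudo_label(idm, frames):
    """Attach an IDM-predicted action to each unlabelled (state, next_state) pair."""
    return [(s, idm[sign(s2 - s)], s2) for s, s2 in frames]


rng = random.Random(0)
source = collect(200, stride=2, rng=rng)   # fully annotated source environment
target = collect(500, stride=1, rng=rng)   # target environment with different dynamics
target_labelled = target[:5]               # small annotated slice of the target data
target_frames = [(s, s2) for s, _, s2 in target[5:]]  # actions withheld

# Fit the IDM on pooled annotated data, then label the unannotated target transitions.
idm = fit_idm(source + target_labelled)
relabelled = pseudo_label(idm, target_frames)

# Compare predicted actions against the withheld ground truth.
matches = sum(pred == true for (_, pred, _), (_, true, _) in zip(relabelled, target[5:]))
accuracy = matches / len(relabelled)
print(f"pseudo-label accuracy: {accuracy:.2f}")
```

In this sketch the pseudo-labelled `relabelled` list would then serve as training data for a downstream policy; the source environment contributes because the chosen feature (direction of change) carries over between the two step sizes, which is a property of this toy setup rather than a claim about the paper's benchmarks.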
Author Information
David Venuto (Mila, McGill)
Mengjiao Yang (Google Brain, UC Berkeley)
Pieter Abbeel (UC Berkeley & Covariant)
Doina Precup (McGill University / DeepMind)
Igor Mordatch (Research, Google)
Ofir Nachum (Google Brain)
More from the Same Authors
-
2021 : Randomized Least Squares Policy Optimization »
Haque Ishfaq · Zhuoran Yang · Andrei Lupu · Viet Nguyen · Lewis Liu · Riashat Islam · Zhaoran Wang · Doina Precup -
2021 : Finite time analysis of temporal difference learning with linear function approximation: the tail averaged case »
Gandharv Patil · Prashanth L.A. · Doina Precup -
2021 : SparseDice: Imitation Learning for Temporally Sparse Data via Regularization »
Alberto Camacho · Izzeddin Gur · Marcin Moczulski · Ofir Nachum · Aleksandra Faust -
2021 : Decision Transformer: Reinforcement Learning via Sequence Modeling »
Lili Chen · Kevin Lu · Aravind Rajeswaran · Kimin Lee · Aditya Grover · Michael Laskin · Pieter Abbeel · Aravind Srinivas · Igor Mordatch -
2021 : Data-Efficient Exploration with Self Play for Atari »
Michael Laskin · Catherine Cang · Ryan Rudes · Pieter Abbeel -
2021 : Hierarchical Few-Shot Imitation with Skill Transition Models »
Kourosh Hakhamaneshi · Ruihan Zhao · Albert Zhan · Pieter Abbeel · Michael Laskin -
2021 : Understanding the Generalization Gap in Visual Reinforcement Learning »
Anurag Ajay · Ge Yang · Ofir Nachum · Pulkit Agrawal -
2021 : Explaining Reinforcement Learning Policies through Counterfactual Trajectories »
Julius Frost · Olivia Watkins · Eric Weiner · Pieter Abbeel · Trevor Darrell · Bryan Plummer · Kate Saenko -
2022 : Multimodal Masked Autoencoders Learn Transferable Representations »
Xinyang Geng · Hao Liu · Lisa Lee · Dale Schuurmans · Sergey Levine · Pieter Abbeel -
2023 : On learning history-based policies for controlling Markov decision processes »
Gandharv Patil · Aditya Mahajan · Doina Precup -
2023 : In-Context Decision-Making from Supervised Pretraining »
Jonathan Lee · Annie Xie · Aldo Pacchiano · Yash Chandak · Chelsea Finn · Ofir Nachum · Emma Brunskill -
2023 : An Empirical Study of the Effectiveness of Using a Replay Buffer on Mode Discovery in GFlowNets »
Nikhil Murali Vemgal · Elaine Lau · Doina Precup -
2023 : Blockwise Parallel Transformer for Long Context Large Models »
Hao Liu · Pieter Abbeel -
2023 : Accelerating exploration and representation learning with offline pre-training »
Bogdan Mazoure · Jake Bruce · Doina Precup · Rob Fergus · Ankit Anand -
2023 Poster: Masked Trajectory Models for Prediction, Representation, and Control »
Philipp Wu · Arjun Majumdar · Kevin Stone · Yixin Lin · Igor Mordatch · Pieter Abbeel · Aravind Rajeswaran -
2023 Poster: Guiding Pretraining in Reinforcement Learning with Large Language Models »
Yuqing Du · Olivia Watkins · Zihan Wang · Cédric Colas · Trevor Darrell · Pieter Abbeel · Abhishek Gupta · Jacob Andreas -
2023 Poster: Controllability-Aware Unsupervised Skill Discovery »
Seohong Park · Kimin Lee · Youngwoon Lee · Pieter Abbeel -
2023 Poster: Emergent Agentic Transformer from Chain of Hindsight Experience »
Hao Liu · Pieter Abbeel -
2023 Poster: Temporally Consistent Transformers for Video Generation »
Wilson Yan · Danijar Hafner · Stephen James · Pieter Abbeel -
2023 Poster: PaLM-E: An Embodied Multimodal Language Model »
Danny Driess · Fei Xia · Mehdi S. M. Sajjadi · Corey Lynch · Aakanksha Chowdhery · Brian Ichter · Ayzaan Wahid · Jonathan Tompson · Quan Vuong · Tianhe (Kevin) Yu · Wenlong Huang · Yevgen Chebotar · Pierre Sermanet · Daniel Duckworth · Sergey Levine · Vincent Vanhoucke · Karol Hausman · Marc Toussaint · Klaus Greff · Andy Zeng · Igor Mordatch · Pete Florence -
2023 Poster: CLUTR: Curriculum Learning via Unsupervised Task Representation Learning »
Abdus Salam Azad · Izzeddin Gur · Jasper Emhoff · Nathaniel Alexis · Aleksandra Faust · Pieter Abbeel · Ion Stoica -
2023 Poster: Multi-View Masked World Models for Visual Robotic Manipulation »
Younggyo Seo · Junsu Kim · Stephen James · Kimin Lee · Jinwoo Shin · Pieter Abbeel -
2023 Poster: The Wisdom of Hindsight Makes Language Models Better Instruction Followers »
Tianjun Zhang · Fangchen Liu · Justin Wong · Pieter Abbeel · Joseph E Gonzalez -
2022 Workshop: Decision Awareness in Reinforcement Learning »
Evgenii Nikishin · Pierluca D'Oro · Doina Precup · Andre Barreto · Amir-massoud Farahmand · Pierre-Luc Bacon -
2022 Poster: Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error »
Scott Fujimoto · David Meger · Doina Precup · Ofir Nachum · Shixiang Gu -
2022 Poster: Model Selection in Batch Policy Optimization »
Jonathan Lee · George Tucker · Ofir Nachum · Bo Dai -
2022 Poster: Making Linear MDPs Practical via Contrastive Representation Learning »
Tianjun Zhang · Tongzheng Ren · Mengjiao Yang · Joseph E Gonzalez · Dale Schuurmans · Bo Dai -
2022 Poster: Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks »
Litian Liang · Yaosheng Xu · Stephen Mcaleer · Dailin Hu · Alexander Ihler · Pieter Abbeel · Roy Fox -
2022 Poster: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents »
Wenlong Huang · Pieter Abbeel · Deepak Pathak · Igor Mordatch -
2022 Spotlight: Making Linear MDPs Practical via Contrastive Representation Learning »
Tianjun Zhang · Tongzheng Ren · Mengjiao Yang · Joseph E Gonzalez · Dale Schuurmans · Bo Dai -
2022 Spotlight: Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error »
Scott Fujimoto · David Meger · Doina Precup · Ofir Nachum · Shixiang Gu -
2022 Spotlight: Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks »
Litian Liang · Yaosheng Xu · Stephen Mcaleer · Dailin Hu · Alexander Ihler · Pieter Abbeel · Roy Fox -
2022 Spotlight: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents »
Wenlong Huang · Pieter Abbeel · Deepak Pathak · Igor Mordatch -
2022 Spotlight: Model Selection in Batch Policy Optimization »
Jonathan Lee · George Tucker · Ofir Nachum · Bo Dai -
2022 Poster: Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification »
Leo Schwinn · Leon Bungert · An Nguyen · René Raab · Falk Pulsmeyer · Doina Precup · Bjoern Eskofier · Dario Zanca -
2022 Poster: Reinforcement Learning with Action-Free Pre-Training from Videos »
Younggyo Seo · Kimin Lee · Stephen James · Pieter Abbeel -
2022 Spotlight: Reinforcement Learning with Action-Free Pre-Training from Videos »
Younggyo Seo · Kimin Lee · Stephen James · Pieter Abbeel -
2022 Spotlight: Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification »
Leo Schwinn · Leon Bungert · An Nguyen · René Raab · Falk Pulsmeyer · Doina Precup · Bjoern Eskofier · Dario Zanca -
2022 Poster: Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization »
Hanjun Dai · Mengjiao Yang · Yuan Xue · Dale Schuurmans · Bo Dai -
2022 Spotlight: Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization »
Hanjun Dai · Mengjiao Yang · Yuan Xue · Dale Schuurmans · Bo Dai -
2021 : Panel Discussion »
Rosemary Nan Ke · Danijar Hafner · Pieter Abbeel · Chelsea Finn -
2021 : Invited Talk by Pieter Abbeel »
Pieter Abbeel -
2021 Poster: Decoupling Representation Learning from Reinforcement Learning »
Adam Stooke · Kimin Lee · Pieter Abbeel · Michael Laskin -
2021 Spotlight: Decoupling Representation Learning from Reinforcement Learning »
Adam Stooke · Kimin Lee · Pieter Abbeel · Michael Laskin -
2021 Poster: Randomized Exploration in Reinforcement Learning with General Value Function Approximation »
Haque Ishfaq · Qiwen Cui · Viet Nguyen · Alex Ayoub · Zhuoran Yang · Zhaoran Wang · Doina Precup · Lin Yang -
2021 Spotlight: Randomized Exploration in Reinforcement Learning with General Value Function Approximation »
Haque Ishfaq · Qiwen Cui · Viet Nguyen · Alex Ayoub · Zhuoran Yang · Zhaoran Wang · Doina Precup · Lin Yang -
2021 Poster: APS: Active Pretraining with Successor Features »
Hao Liu · Pieter Abbeel -
2021 Poster: SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning »
Kimin Lee · Michael Laskin · Aravind Srinivas · Pieter Abbeel -
2021 Spotlight: SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning »
Kimin Lee · Michael Laskin · Aravind Srinivas · Pieter Abbeel -
2021 Oral: APS: Active Pretraining with Successor Features »
Hao Liu · Pieter Abbeel -
2021 Poster: PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training »
Kimin Lee · Laura Smith · Pieter Abbeel -
2021 Poster: Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning »
Hiroki Furuta · Tatsuya Matsushima · Tadashi Kozuno · Yutaka Matsuo · Sergey Levine · Ofir Nachum · Shixiang Gu -
2021 Poster: Offline Reinforcement Learning with Fisher Divergence Critic Regularization »
Ilya Kostrikov · Rob Fergus · Jonathan Tompson · Ofir Nachum -
2021 Poster: Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards »
Susan Amin · Maziar Gomrokchi · Hossein Aboutalebi · Harsh Satija · Doina Precup -
2021 Poster: A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation »
Scott Fujimoto · David Meger · Doina Precup -
2021 Poster: Representation Matters: Offline Pretraining for Sequential Decision Making »
Mengjiao Yang · Ofir Nachum -
2021 Spotlight: Representation Matters: Offline Pretraining for Sequential Decision Making »
Mengjiao Yang · Ofir Nachum -
2021 Spotlight: Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning »
Hiroki Furuta · Tatsuya Matsushima · Tadashi Kozuno · Yutaka Matsuo · Sergey Levine · Ofir Nachum · Shixiang Gu -
2021 Spotlight: A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation »
Scott Fujimoto · David Meger · Doina Precup -
2021 Oral: PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training »
Kimin Lee · Laura Smith · Pieter Abbeel -
2021 Spotlight: Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards »
Susan Amin · Maziar Gomrokchi · Hossein Aboutalebi · Harsh Satija · Doina Precup -
2021 Spotlight: Offline Reinforcement Learning with Fisher Divergence Critic Regularization »
Ilya Kostrikov · Rob Fergus · Jonathan Tompson · Ofir Nachum -
2021 Poster: Unsupervised Learning of Visual 3D Keypoints for Control »
Boyuan Chen · Pieter Abbeel · Deepak Pathak -
2021 Poster: State Entropy Maximization with Random Encoders for Efficient Exploration »
Younggyo Seo · Lili Chen · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee -
2021 Poster: MSA Transformer »
Roshan Rao · Jason Liu · Robert Verkuil · Joshua Meier · John Canny · Pieter Abbeel · Tom Sercu · Alexander Rives -
2021 Poster: Preferential Temporal Difference Learning »
Nishanth Anand · Doina Precup -
2021 Spotlight: MSA Transformer »
Roshan Rao · Jason Liu · Robert Verkuil · Joshua Meier · John Canny · Pieter Abbeel · Tom Sercu · Alexander Rives -
2021 Spotlight: State Entropy Maximization with Random Encoders for Efficient Exploration »
Younggyo Seo · Lili Chen · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee -
2021 Spotlight: Unsupervised Learning of Visual 3D Keypoints for Control »
Boyuan Chen · Pieter Abbeel · Deepak Pathak -
2021 Spotlight: Preferential Temporal Difference Learning »
Nishanth Anand · Doina Precup -
2021 : Part 2: Unsupervised Pre-Training in RL »
Pieter Abbeel -
2021 Tutorial: Unsupervised Learning for Reinforcement Learning »
Aravind Srinivas · Pieter Abbeel -
2020 : Panel Discussion »
Eric Eaton · Martha White · Doina Precup · Irina Rish · Harm van Seijen -
2020 Workshop: 4th Lifelong Learning Workshop »
Shagun Sodhani · Sarath Chandar · Balaraman Ravindran · Doina Precup -
2020 Poster: Energy-Based Processes for Exchangeable Data »
Mengjiao Yang · Bo Dai · Hanjun Dai · Dale Schuurmans -
2020 Poster: CURL: Contrastive Unsupervised Representations for Reinforcement Learning »
Michael Laskin · Aravind Srinivas · Pieter Abbeel -
2020 Poster: Interference and Generalization in Temporal Difference Learning »
Emmanuel Bengio · Joelle Pineau · Doina Precup -
2020 Poster: Hallucinative Topological Memory for Zero-Shot Visual Planning »
Kara Liu · Thanard Kurutach · Christine Tung · Pieter Abbeel · Aviv Tamar -
2020 Poster: Planning to Explore via Self-Supervised World Models »
Ramanan Sekar · Oleh Rybkin · Kostas Daniilidis · Pieter Abbeel · Danijar Hafner · Deepak Pathak -
2020 Poster: Invariant Causal Prediction for Block MDPs »
Amy Zhang · Clare Lyle · Shagun Sodhani · Angelos Filos · Marta Kwiatkowska · Joelle Pineau · Yarin Gal · Doina Precup -
2020 Poster: Responsive Safety in Reinforcement Learning by PID Lagrangian Methods »
Adam Stooke · Joshua Achiam · Pieter Abbeel -
2020 Poster: Variable Skipping for Autoregressive Range Density Estimation »
Eric Liang · Zongheng Yang · Ion Stoica · Pieter Abbeel · Yan Duan · Peter Chen -
2020 Poster: Hierarchically Decoupled Imitation For Morphological Transfer »
Donald Hejna · Lerrel Pinto · Pieter Abbeel -
2020 : Mentoring Panel: Doina Precup, Deborah Raji, Anima Anandkumar, Angjoo Kanazawa and Sinead Williamson (moderator). »
Doina Precup · Inioluwa Raji · Angjoo Kanazawa · Sinead A Williamson · Animashree Anandkumar -
2020 : Invited Talk: Doina Precup on Building Knowledge for AI Agents with Reinforcement Learning »
Doina Precup -
2019 Workshop: Workshop on Self-Supervised Learning »
Aaron van den Oord · Yusuf Aytar · Carl Doersch · Carl Vondrick · Alec Radford · Pierre Sermanet · Amir Zamir · Pieter Abbeel -
2019 Workshop: Workshop on Multi-Task and Lifelong Reinforcement Learning »
Sarath Chandar · Shagun Sodhani · Khimya Khetarpal · Tom Zahavy · Daniel J. Mankowitz · Shie Mannor · Balaraman Ravindran · Doina Precup · Chelsea Finn · Abhishek Gupta · Amy Zhang · Kyunghyun Cho · Andrei A Rusu · Rob Fergus -
2019 : Networking Lunch (provided) + Poster Session »
Abraham Stanway · Alex Robson · Aneesh Rangnekar · Ashesh Chattopadhyay · Ashley Pilipiszyn · Benjamin LeRoy · Bolong Cheng · Ce Zhang · Chaopeng Shen · Christian Schroeder · Christian Clough · Clement DUHART · Clement Fung · Cozmin Ududec · Dali Wang · David Dao · di wu · Dimitrios Giannakis · Dino Sejdinovic · Doina Precup · Duncan Watson-Parris · Gege Wen · George Chen · Gopal Erinjippurath · Haifeng Li · Han Zou · Herke van Hoof · Hillary A Scannell · Hiroshi Mamitsuka · Hongbao Zhang · Jaegul Choo · James Wang · James Requeima · Jessica Hwang · Jinfan Xu · Johan Mathe · Jonathan Binas · Joonseok Lee · Kalai Ramea · Kate Duffy · Kevin McCloskey · Kris Sankaran · Lester Mackey · Letif Mones · Loubna Benabbou · Lynn Kaack · Matthew Hoffman · Mayur Mudigonda · Mehrdad Mahdavi · Michael McCourt · Mingchao Jiang · Mohammad Mahdi Kamani · Neel Guha · Niccolo Dalmasso · Nick Pawlowski · Nikola Milojevic-Dupont · Paulo Orenstein · Pedram Hassanzadeh · Pekka Marttinen · Ramesh Nair · Sadegh Farhang · Samuel Kaski · Sandeep Manjanna · Sasha Luccioni · Shuby Deshpande · Soo Kim · Soukayna Mouatadid · Sunghyun Park · Tao Lin · Telmo Felgueira · Thomas Hornigold · Tianle Yuan · Tom Beucler · Tracy Cui · Volodymyr Kuleshov · Wei Yu · yang song · Ydo Wexler · Yoshua Bengio · Zhecheng Wang · Zhuangfang Yi · Zouheir Malki -
2019 : posters »
Zhengxing Chen · Juan Jose Garau Luis · Ignacio Albert Smet · Aditya Modi · Sabina Tomkins · Riley Simmons-Edler · Hongzi Mao · Alexander Irpan · Hao Lu · Rose Wang · Subhojyoti Mukherjee · Aniruddh Raghu · Syed Arbab Mohd Shihab · Byung Hoon Ahn · Rasool Fakoor · Pratik Chaudhari · Elena Smirnova · Min-hwan Oh · Xiaocheng Tang · Tony Qin · Qingyang Li · Marc Brittain · Ian Fox · Supratik Paul · Xiaofeng Gao · Yinlam Chow · Gabriel Dulac-Arnold · Ofir Nachum · Nikos Karampatziakis · Bharathan Balaji · Supratik Paul · Ali Davody · Djallel Bouneffouf · Himanshu Sahni · Soo Kim · Andrey Kolobov · Alexander Amini · Yao Liu · Xinshi Chen · Craig Boutilier -
2019 Poster: Bit-Swap: Recursive Bits-Back Coding for Lossless Compression with Hierarchical Latent Variables »
Friso Kingma · Pieter Abbeel · Jonathan Ho -
2019 Poster: On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference »
Rohin Shah · Noah Gundotra · Pieter Abbeel · Anca Dragan -
2019 Oral: On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference »
Rohin Shah · Noah Gundotra · Pieter Abbeel · Anca Dragan -
2019 Oral: Bit-Swap: Recursive Bits-Back Coding for Lossless Compression with Hierarchical Latent Variables »
Friso Kingma · Pieter Abbeel · Jonathan Ho -
2019 Poster: Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules »
Daniel Ho · Eric Liang · Peter Chen · Ion Stoica · Pieter Abbeel -
2019 Poster: Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design »
Jonathan Ho · Peter Chen · Aravind Srinivas · Rocky Duan · Pieter Abbeel -
2019 Poster: SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning »
Marvin Zhang · Sharad Vikram · Laura Smith · Pieter Abbeel · Matthew Johnson · Sergey Levine -
2019 Oral: Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design »
Jonathan Ho · Peter Chen · Aravind Srinivas · Rocky Duan · Pieter Abbeel -
2019 Oral: Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules »
Daniel Ho · Eric Liang · Peter Chen · Ion Stoica · Pieter Abbeel -
2019 Oral: SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning »
Marvin Zhang · Sharad Vikram · Laura Smith · Pieter Abbeel · Matthew Johnson · Sergey Levine -
2019 Poster: Off-Policy Deep Reinforcement Learning without Exploration »
Scott Fujimoto · David Meger · Doina Precup -
2019 Poster: DeepMDP: Learning Continuous Latent Space Models for Representation Learning »
Carles Gelada · Saurabh Kumar · Jacob Buckman · Ofir Nachum · Marc Bellemare -
2019 Oral: DeepMDP: Learning Continuous Latent Space Models for Representation Learning »
Carles Gelada · Saurabh Kumar · Jacob Buckman · Ofir Nachum · Marc Bellemare -
2019 Oral: Off-Policy Deep Reinforcement Learning without Exploration »
Scott Fujimoto · David Meger · Doina Precup -
2018 Poster: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor »
Tuomas Haarnoja · Aurick Zhou · Pieter Abbeel · Sergey Levine -
2018 Poster: Smoothed Action Value Functions for Learning Gaussian Policies »
Ofir Nachum · Mohammad Norouzi · George Tucker · Dale Schuurmans -
2018 Poster: PixelSNAIL: An Improved Autoregressive Generative Model »
Xi Chen · Nikhil Mishra · Mostafa Rohaninejad · Pieter Abbeel -
2018 Poster: Convergent Tree Backup and Retrace with Function Approximation »
Ahmed Touati · Pierre-Luc Bacon · Doina Precup · Pascal Vincent -
2018 Oral: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor »
Tuomas Haarnoja · Aurick Zhou · Pieter Abbeel · Sergey Levine -
2018 Oral: PixelSNAIL: An Improved Autoregressive Generative Model »
Xi Chen · Nikhil Mishra · Mostafa Rohaninejad · Pieter Abbeel -
2018 Oral: Smoothed Action Value Functions for Learning Gaussian Policies »
Ofir Nachum · Mohammad Norouzi · George Tucker · Dale Schuurmans -
2018 Oral: Convergent Tree Backup and Retrace with Function Approximation »
Ahmed Touati · Pierre-Luc Bacon · Doina Precup · Pascal Vincent -
2018 Poster: Automatic Goal Generation for Reinforcement Learning Agents »
Carlos Florensa · David Held · Xinyang Geng · Pieter Abbeel -
2018 Poster: Latent Space Policies for Hierarchical Reinforcement Learning »
Tuomas Haarnoja · Kristian Hartikainen · Pieter Abbeel · Sergey Levine -
2018 Poster: Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings »
John Co-Reyes · Yu Xuan Liu · Abhishek Gupta · Benjamin Eysenbach · Pieter Abbeel · Sergey Levine -
2018 Poster: Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control »
Aravind Srinivas · Allan Jabri · Pieter Abbeel · Sergey Levine · Chelsea Finn -
2018 Poster: Path Consistency Learning in Tsallis Entropy Regularized MDPs »
Yinlam Chow · Ofir Nachum · Mohammad Ghavamzadeh -
2018 Oral: Path Consistency Learning in Tsallis Entropy Regularized MDPs »
Yinlam Chow · Ofir Nachum · Mohammad Ghavamzadeh -
2018 Oral: Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control »
Aravind Srinivas · Allan Jabri · Pieter Abbeel · Sergey Levine · Chelsea Finn -
2018 Oral: Automatic Goal Generation for Reinforcement Learning Agents »
Carlos Florensa · David Held · Xinyang Geng · Pieter Abbeel -
2018 Oral: Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings »
John Co-Reyes · Yu Xuan Liu · Abhishek Gupta · Benjamin Eysenbach · Pieter Abbeel · Sergey Levine -
2018 Oral: Latent Space Policies for Hierarchical Reinforcement Learning »
Tuomas Haarnoja · Kristian Hartikainen · Pieter Abbeel · Sergey Levine -
2017 Workshop: Reinforcement Learning Workshop »
Doina Precup · Balaraman Ravindran · Pierre-Luc Bacon -
2017 Poster: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks »
Chelsea Finn · Pieter Abbeel · Sergey Levine -
2017 Poster: Prediction and Control with Temporal Segment Models »
Nikhil Mishra · Pieter Abbeel · Igor Mordatch -
2017 Poster: Reinforcement Learning with Deep Energy-Based Policies »
Tuomas Haarnoja · Haoran Tang · Pieter Abbeel · Sergey Levine -
2017 Poster: Constrained Policy Optimization »
Joshua Achiam · David Held · Aviv Tamar · Pieter Abbeel -
2017 Talk: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks »
Chelsea Finn · Pieter Abbeel · Sergey Levine -
2017 Talk: Prediction and Control with Temporal Segment Models »
Nikhil Mishra · Pieter Abbeel · Igor Mordatch -
2017 Talk: Reinforcement Learning with Deep Energy-Based Policies »
Tuomas Haarnoja · Haoran Tang · Pieter Abbeel · Sergey Levine -
2017 Talk: Constrained Policy Optimization »
Joshua Achiam · David Held · Aviv Tamar · Pieter Abbeel