Offline reinforcement learning has shown great promise in leveraging large pre-collected datasets for policy learning, allowing agents to forgo often-expensive online data collection. However, to date, offline reinforcement learning from visual observations with continuous action spaces has been relatively under-explored, and there is a lack of understanding of where the remaining challenges lie. In this paper, we seek to establish simple baselines for continuous control in the visual domain. We show that simple modifications to two state-of-the-art vision-based online reinforcement learning algorithms, DreamerV2 and DrQ-v2, suffice to outperform prior work and establish a competitive baseline. We rigorously evaluate these algorithms on both existing offline datasets and a new testbed for offline reinforcement learning from visual observations that better represents the data distributions present in real-world offline RL problems, and open-source our code and data to facilitate progress in this important domain. Finally, we present and analyze several key desiderata unique to offline RL from visual observations, including visual distractions and visually identifiable changes in dynamics.
Author Information
Cong Lu (University of Oxford)
Philip Ball (University of Oxford)
Tim G. J. Rudner (University of Oxford)
Jack Parker-Holder (University of Oxford)
Michael A Osborne (University of Oxford)
Yee-Whye Teh (Oxford and DeepMind)
More from the Same Authors
-
2021 : Continual Learning via Function-Space Variational Inference: A Unifying View »
Tim G. J. Rudner · Freddie Bickford Smith · Qixuan Feng · Yee-Whye Teh · Yarin Gal -
2021 : Attacking Graph Classification via Bayesian Optimisation »
Xingchen Wan · Henry Kenlay · Binxin Ru · Arno Blaas · Michael A Osborne · Xiaowen Dong -
2021 : Meta Learning MDPs with linear transition models »
Robert Müller · Aldo Pacchiano · Jack Parker-Holder -
2021 : Revisiting Design Choices in Offline Model Based Reinforcement Learning »
Cong Lu · Philip Ball · Jack Parker-Holder · Michael A Osborne · Stephen Roberts -
2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · Jie Ren · Joost van Amersfoort · Kehang Han · E. Kelly Buchanan · Kevin Murphy · Mark Collier · Mike Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani -
2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · Jie Ren · Joost van Amersfoort · Kehang Han · Estefany Kelly Buchanan · Kevin Murphy · Mark Collier · Michael Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani -
2023 : Synthetic Experience Replay »
Cong Lu · Philip Ball · Yee-Whye Teh · Jack Parker-Holder -
2023 : SOBER: Highly Parallel Bayesian Optimization and Bayesian Quadrature over Discrete and Mixed Spaces »
Masaki Adachi · Satoshi Hayakawa · Saad Hamid · Martin Jørgensen · Harald Oberhauser · Michael A Osborne -
2023 Poster: Modality-Agnostic Variational Compression of Implicit Neural Representations »
Jonathan Richard Schwarz · Jihoon Tack · Yee-Whye Teh · Jaeho Lee · Jinwoo Shin -
2023 Poster: Learning Instance-Specific Augmentations by Capturing Local Invariances »
Ning Miao · Tom Rainforth · Emile Mathieu · Yann Dubois · Yee-Whye Teh · Adam Foster · Hyunjik Kim -
2023 Poster: Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions »
Leo Klarner · Tim G. J. Rudner · Michael Reutlinger · Torsten Schindler · Garrett Morris · Charlotte Deane · Yee-Whye Teh -
2023 Poster: Efficient Online Reinforcement Learning with Offline Data »
Philip Ball · Laura Smith · Ilya Kostrikov · Sergey Levine -
2022 Poster: Evolving Curricula with Regret-Based Environment Design »
Jack Parker-Holder · Minqi Jiang · Michael Dennis · Mikayel Samvelyan · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2022 Poster: Continual Learning via Sequential Function-Space Variational Inference »
Tim G. J Rudner · Freddie Bickford Smith · Qixuan Feng · Yee-Whye Teh · Yarin Gal -
2022 Spotlight: Evolving Curricula with Regret-Based Environment Design »
Jack Parker-Holder · Minqi Jiang · Michael Dennis · Mikayel Samvelyan · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2022 Spotlight: Continual Learning via Sequential Function-Space Variational Inference »
Tim G. J Rudner · Freddie Bickford Smith · Qixuan Feng · Yee-Whye Teh · Yarin Gal -
2022 Poster: Robust Multi-Objective Bayesian Optimization Under Input Noise »
Samuel Daulton · Sait Cakmak · Maximilian Balandat · Michael A Osborne · Enlu Zhou · Eytan Bakshy -
2022 Spotlight: Robust Multi-Objective Bayesian Optimization Under Input Noise »
Samuel Daulton · Sait Cakmak · Maximilian Balandat · Michael A Osborne · Enlu Zhou · Eytan Bakshy -
2022 Poster: Stabilizing Off-Policy Deep Reinforcement Learning from Pixels »
Edoardo Cetin · Philip Ball · Stephen Roberts · Oya Celiktutan -
2022 Spotlight: Stabilizing Off-Policy Deep Reinforcement Learning from Pixels »
Edoardo Cetin · Philip Ball · Stephen Roberts · Oya Celiktutan -
2021 : Continual Learning via Function-Space Variational Inference: A Unifying View »
Yarin Gal · Yee-Whye Teh · Qixuan Feng · Freddie Bickford Smith · Tim G. J. Rudner -
2021 : Spotlight »
Zhiwei (Tony) Qin · Xianyuan Zhan · Meng Qi · Ruihan Yang · Philip Ball · Hamsa Bastani · Yao Liu · Xiuwen Wang · Haoran Xu · Tony Z. Zhao · Lili Chen · Aviral Kumar -
2021 Workshop: Challenges in Deploying and Monitoring Machine Learning Systems »
Alessandra Tosi · Nathan Korda · Michael A Osborne · Stephen Roberts · Andrei Paleyes · Fariba Yousefi -
2021 Poster: Equivariant Learning of Stochastic Fields: Gaussian Processes and Steerable Conditional Neural Processes »
Peter Holderrieth · Michael Hutchinson · Yee-Whye Teh -
2021 Poster: Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning »
Luisa Zintgraf · Leo Feng · Cong Lu · Maximilian Igl · Kristian Hartikainen · Katja Hofmann · Shimon Whiteson -
2021 Spotlight: Equivariant Learning of Stochastic Fields: Gaussian Processes and Steerable Conditional Neural Processes »
Peter Holderrieth · Michael Hutchinson · Yee-Whye Teh -
2021 Test Of Time: Bayesian Learning via Stochastic Gradient Langevin Dynamics »
Yee Teh · Max Welling -
2021 Spotlight: Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning »
Luisa Zintgraf · Leo Feng · Cong Lu · Maximilian Igl · Kristian Hartikainen · Katja Hofmann · Shimon Whiteson -
2021 Poster: Think Global and Act Local: Bayesian Optimisation over High-Dimensional Categorical and Mixed Search Spaces »
Xingchen Wan · Vu Nguyen · Huong Ha · Binxin Ru · Cong Lu · Michael A Osborne -
2021 Poster: Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search »
Vu Nguyen · Tam Le · Makoto Yamada · Michael A Osborne -
2021 Poster: Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment »
Philip Ball · Cong Lu · Jack Parker-Holder · Stephen Roberts -
2021 Spotlight: Think Global and Act Local: Bayesian Optimisation over High-Dimensional Categorical and Mixed Search Spaces »
Xingchen Wan · Vu Nguyen · Huong Ha · Binxin Ru · Cong Lu · Michael A Osborne -
2021 Spotlight: Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment »
Philip Ball · Cong Lu · Jack Parker-Holder · Stephen Roberts -
2021 Spotlight: Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search »
Vu Nguyen · Tam Le · Makoto Yamada · Michael A Osborne -
2021 Poster: LieTransformer: Equivariant Self-Attention for Lie Groups »
Michael Hutchinson · Charline Le Lan · Sheheryar Zaidi · Emilien Dupont · Yee-Whye Teh · Hyunjik Kim -
2021 Spotlight: LieTransformer: Equivariant Self-Attention for Lie Groups »
Michael Hutchinson · Charline Le Lan · Sheheryar Zaidi · Emilien Dupont · Yee-Whye Teh · Hyunjik Kim -
2020 : Panel Discussion »
Neil Lawrence · Mihaela van der Schaar · Alex Smola · Valerio Perrone · Jack Parker-Holder · Zhengying Liu -
2020 : Contributed Talk 1: Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits »
Jack Parker-Holder · Vu Nguyen · Stephen Roberts -
2020 : Spotlight talk 2 - Ridge Riding: Finding diverse solutions by following eigenvectors of the Hessian »
Jack Parker-Holder -
2020 Poster: Stochastic Flows and Geometric Optimization on the Orthogonal Group »
Krzysztof Choromanski · David Cheikhi · Jared Quincy Davis · Valerii Likhosherstov · Achille Nazaret · Achraf Bahamou · Xingyou Song · Mrugank Akarte · Jack Parker-Holder · Jacob Bergquist · Yuan Gao · Aldo Pacchiano · Tamas Sarlos · Adrian Weller · Vikas Sindhwani -
2020 Poster: MetaFun: Meta-Learning with Iterative Functional Updates »
Jin Xu · Jean-Francois Ton · Hyunjik Kim · Adam Kosiorek · Yee-Whye Teh -
2020 Poster: Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support »
Yuan Zhou · Hongseok Yang · Yee-Whye Teh · Tom Rainforth -
2020 Poster: Learning to Score Behaviors for Guided Policy Optimization »
Aldo Pacchiano · Jack Parker-Holder · Yunhao Tang · Krzysztof Choromanski · Anna Choromanska · Michael Jordan -
2020 Poster: Knowing The What But Not The Where in Bayesian Optimization »
Vu Nguyen · Michael A Osborne -
2020 Poster: Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise »
Umut Simsekli · Lingjiong Zhu · Yee-Whye Teh · Mert Gurbuzbalaban -
2020 Poster: Uncertainty Estimation Using a Single Deep Deterministic Neural Network »
Joost van Amersfoort · Lewis Smith · Yee-Whye Teh · Yarin Gal -
2020 Poster: Ready Policy One: World Building Through Active Learning »
Philip Ball · Jack Parker-Holder · Aldo Pacchiano · Krzysztof Choromanski · Stephen Roberts -
2020 Poster: Bayesian Optimisation over Multiple Continuous and Categorical Inputs »
Binxin Ru · Ahsan Alvi · Vu Nguyen · Michael A Osborne · Stephen Roberts -
2019 Poster: On the Limitations of Representing Functions on Sets »
Edward Wagstaff · Fabian Fuchs · Martin Engelcke · Ingmar Posner · Michael A Osborne -
2019 Oral: On the Limitations of Representing Functions on Sets »
Edward Wagstaff · Fabian Fuchs · Martin Engelcke · Ingmar Posner · Michael A Osborne -
2019 Oral: Hybrid Models with Deep and Invertible Features »
Eric Nalisnick · Akihiro Matsukawa · Yee-Whye Teh · Dilan Gorur · Balaji Lakshminarayanan -
2019 Poster: Automated Model Selection with Bayesian Quadrature »
Henry Chai · Jean-Francois Ton · Michael A Osborne · Roman Garnett -
2019 Poster: Disentangling Disentanglement in Variational Autoencoders »
Emile Mathieu · Tom Rainforth · N Siddharth · Yee-Whye Teh -
2019 Poster: AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs »
Gabriele Abbati · Philippe Wenk · Michael A Osborne · Andreas Krause · Bernhard Schölkopf · Stefan Bauer -
2019 Poster: Asynchronous Batch Bayesian Optimisation with Improved Local Penalisation »
Ahsan Alvi · Binxin Ru · Jan-Peter Calliess · Stephen Roberts · Michael A Osborne -
2019 Poster: Hybrid Models with Deep and Invertible Features »
Eric Nalisnick · Akihiro Matsukawa · Yee-Whye Teh · Dilan Gorur · Balaji Lakshminarayanan -
2019 Oral: Automated Model Selection with Bayesian Quadrature »
Henry Chai · Jean-Francois Ton · Michael A Osborne · Roman Garnett -
2019 Oral: AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs »
Gabriele Abbati · Philippe Wenk · Michael A Osborne · Andreas Krause · Bernhard Schölkopf · Stefan Bauer -
2019 Oral: Disentangling Disentanglement in Variational Autoencoders »
Emile Mathieu · Tom Rainforth · N Siddharth · Yee-Whye Teh -
2019 Oral: Asynchronous Batch Bayesian Optimisation with Improved Local Penalisation »
Ahsan Alvi · Binxin Ru · Jan-Peter Calliess · Stephen Roberts · Michael A Osborne -
2019 Poster: Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks »
Juho Lee · Yoonho Lee · Jungtaek Kim · Adam Kosiorek · Seungjin Choi · Yee-Whye Teh -
2019 Poster: Fingerprint Policy Optimisation for Robust Reinforcement Learning »
Supratik Paul · Michael A Osborne · Shimon Whiteson -
2019 Oral: Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks »
Juho Lee · Yoonho Lee · Jungtaek Kim · Adam Kosiorek · Seungjin Choi · Yee-Whye Teh -
2019 Oral: Fingerprint Policy Optimisation for Robust Reinforcement Learning »
Supratik Paul · Michael A Osborne · Shimon Whiteson -
2018 Poster: Progress & Compress: A scalable framework for continual learning »
Jonathan Richard Schwarz · Wojciech Czarnecki · Jelena Luketina · Agnieszka Grabska-Barwinska · Yee Teh · Razvan Pascanu · Raia Hadsell -
2018 Poster: Mix & Match - Agent Curricula for Reinforcement Learning »
Wojciech Czarnecki · Siddhant Jayakumar · Max Jaderberg · Leonard Hasenclever · Yee Teh · Nicolas Heess · Simon Osindero · Razvan Pascanu -
2018 Oral: Progress & Compress: A scalable framework for continual learning »
Jonathan Richard Schwarz · Wojciech Czarnecki · Jelena Luketina · Agnieszka Grabska-Barwinska · Yee Teh · Razvan Pascanu · Raia Hadsell -
2018 Oral: Mix & Match - Agent Curricula for Reinforcement Learning »
Wojciech Czarnecki · Siddhant Jayakumar · Max Jaderberg · Leonard Hasenclever · Yee Teh · Nicolas Heess · Simon Osindero · Razvan Pascanu -
2018 Poster: Fast Information-theoretic Bayesian Optimisation »
Binxin Ru · Michael A Osborne · Mark McLeod · Diego Granziol -
2018 Poster: Optimization, fast and slow: optimally switching between local and Bayesian optimization »
Mark McLeod · Stephen Roberts · Michael A Osborne -
2018 Oral: Optimization, fast and slow: optimally switching between local and Bayesian optimization »
Mark McLeod · Stephen Roberts · Michael A Osborne -
2018 Oral: Fast Information-theoretic Bayesian Optimisation »
Binxin Ru · Michael A Osborne · Mark McLeod · Diego Granziol -
2018 Poster: Conditional Neural Processes »
Marta Garnelo · Dan Rosenbaum · Chris Maddison · Tiago Ramalho · David Saxton · Murray Shanahan · Yee Teh · Danilo J. Rezende · S. M. Ali Eslami -
2018 Poster: Tighter Variational Bounds are Not Necessarily Better »
Tom Rainforth · Adam Kosiorek · Tuan Anh Le · Chris Maddison · Maximilian Igl · Frank Wood · Yee-Whye Teh -
2018 Oral: Tighter Variational Bounds are Not Necessarily Better »
Tom Rainforth · Adam Kosiorek · Tuan Anh Le · Chris Maddison · Maximilian Igl · Frank Wood · Yee-Whye Teh -
2018 Oral: Conditional Neural Processes »
Marta Garnelo · Dan Rosenbaum · Chris Maddison · Tiago Ramalho · David Saxton · Murray Shanahan · Yee Teh · Danilo J. Rezende · S. M. Ali Eslami