Timezone: »
Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem~\cite{chen2021decision, janner2021offline} and solved via approaches similar to large-scale language modeling. However, any practical instantiation of RL also involves an online component, where policies pretrained on passive offline datasets are finetuned via task-specific interactions with the environment. We propose Online Decision Transformers (ODT), an RL algorithm based on sequence modeling that blends offline pretraining with online finetuning in a unified framework. Our framework uses sequence-level entropy regularizers in conjunction with autoregressive modeling objectives for sample-efficient exploration and finetuning. Empirically, we show that ODT is competitive with the state-of-the-art in absolute performance on the D4RL benchmark but shows much more significant gains during the finetuning procedure.
Author Information
Qinqing Zheng (Meta AI Research)
Amy Zhang (FAIR / UC Berkeley)
Aditya Grover (UCLA)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Online Decision Transformer »
Thu. Jul 21st through Fri the 22nd Room Hall E #1023
More from the Same Authors
-
2020 : Learning Invariant Representations for Reinforcement Learning without Reconstruction »
Amy Zhang -
2020 : Multi-Task Reinforcement Learning as a Hidden-Parameter Block MDP »
Amy Zhang -
2021 : Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits »
Wenshuo Guo · Kumar Agrawal · Aditya Grover · Vidya Muthukumar · Ashwin Pananjady -
2021 : Decision Transformer: Reinforcement Learning via Sequence Modeling »
Lili Chen · Kevin Lu · Aravind Rajeswaran · Kimin Lee · Aditya Grover · Michael Laskin · Pieter Abbeel · Aravind Srinivas · Igor Mordatch -
2021 : Decision Transformer: Reinforcement Learning via Sequence Modeling »
Lili Chen · Kevin Lu · Aravind Rajeswaran · Kimin Lee · Aditya Grover · Michael Laskin · Pieter Abbeel · Aravind Srinivas · Igor Mordatch -
2023 : Conditional Bisimulation for Generalization in Reinforcement Learning »
Anuj Mahajan · Amy Zhang -
2023 Poster: Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning »
Tongzhou Wang · Antonio Torralba · Phillip Isola · Amy Zhang -
2023 Poster: LIV: Language-Image Representations and Rewards for Robotic Control »
Yecheng Jason Ma · Vikash Kumar · Amy Zhang · Osbert Bastani · Dinesh Jayaraman -
2023 Poster: Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories »
Qinqing Zheng · Mikael Henaff · Brandon Amos · Aditya Grover -
2022 : Invited talks 3, Q/A, Amy, Rich and Liting »
Liting Sun · Amy Zhang · Richard Zemel -
2022 : Invited talks 3, Amy Zhang, Rich Zemel and Liting Sun »
Amy Zhang · Richard Zemel · Liting Sun -
2022 Poster: Robust Policy Learning over Multiple Uncertainty Sets »
Annie Xie · Shagun Sodhani · Chelsea Finn · Joelle Pineau · Amy Zhang -
2022 Poster: Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning »
Philippe Hansen-Estruch · Amy Zhang · Ashvin Nair · Patrick Yin · Sergey Levine -
2022 Spotlight: Robust Policy Learning over Multiple Uncertainty Sets »
Annie Xie · Shagun Sodhani · Chelsea Finn · Joelle Pineau · Amy Zhang -
2022 Spotlight: Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning »
Philippe Hansen-Estruch · Amy Zhang · Ashvin Nair · Patrick Yin · Sergey Levine -
2022 Poster: Denoised MDPs: Learning World Models Better Than the World Itself »
Tongzhou Wang · Simon Du · Antonio Torralba · Phillip Isola · Amy Zhang · Yuandong Tian -
2022 Spotlight: Denoised MDPs: Learning World Models Better Than the World Itself »
Tongzhou Wang · Simon Du · Antonio Torralba · Phillip Isola · Amy Zhang · Yuandong Tian -
2022 Poster: Matching Normalizing Flows and Probability Paths on Manifolds »
Heli Ben-Hamu · samuel cohen · Joey Bose · Brandon Amos · Maximilian Nickel · Aditya Grover · Ricky T. Q. Chen · Yaron Lipman -
2022 Poster: Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling »
Tung Nguyen · Aditya Grover -
2022 Spotlight: Matching Normalizing Flows and Probability Paths on Manifolds »
Heli Ben-Hamu · samuel cohen · Joey Bose · Brandon Amos · Maximilian Nickel · Aditya Grover · Ricky T. Q. Chen · Yaron Lipman -
2022 Spotlight: Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling »
Tung Nguyen · Aditya Grover -
2021 Poster: Near-Optimal Confidence Sequences for Bounded Random Variables »
Arun Kuchibhotla · Qinqing Zheng -
2021 Spotlight: Near-Optimal Confidence Sequences for Bounded Random Variables »
Arun Kuchibhotla · Qinqing Zheng -
2020 : Paper spotlight: Learning Invariant Representations for Reinforcement Learning without Reconstruction »
Amy Zhang -
2020 Poster: Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion »
Qinqing Zheng · Jinshuo Dong · Qi Long · Weijie Su -
2020 Poster: Fair Generative Modeling via Weak Supervision »
Kristy Choi · Aditya Grover · Trisha Singh · Rui Shu · Stefano Ermon -
2019 Poster: Graphite: Iterative Generative Modeling of Graphs »
Aditya Grover · Aaron Zweig · Stefano Ermon -
2019 Oral: Graphite: Iterative Generative Modeling of Graphs »
Aditya Grover · Aaron Zweig · Stefano Ermon -
2019 Poster: Neural Joint Source-Channel Coding »
Kristy Choi · Kedar Tatwawadi · Aditya Grover · Tsachy Weissman · Stefano Ermon -
2019 Oral: Neural Joint Source-Channel Coding »
Kristy Choi · Kedar Tatwawadi · Aditya Grover · Tsachy Weissman · Stefano Ermon -
2018 Poster: Modeling Sparse Deviations for Compressed Sensing using Generative Models »
Manik Dhar · Aditya Grover · Stefano Ermon -
2018 Oral: Modeling Sparse Deviations for Compressed Sensing using Generative Models »
Manik Dhar · Aditya Grover · Stefano Ermon -
2018 Poster: Composable Planning with Attributes »
Amy Zhang · Sainbayar Sukhbaatar · Adam Lerer · Arthur Szlam · Facebook Rob Fergus -
2018 Poster: Learning Policy Representations in Multiagent Systems »
Aditya Grover · Maruan Al-Shedivat · Jayesh K. Gupta · Yura Burda · Harrison Edwards -
2018 Oral: Composable Planning with Attributes »
Amy Zhang · Sainbayar Sukhbaatar · Adam Lerer · Arthur Szlam · Facebook Rob Fergus -
2018 Oral: Learning Policy Representations in Multiagent Systems »
Aditya Grover · Maruan Al-Shedivat · Jayesh K. Gupta · Yura Burda · Harrison Edwards