Recently, researchers have demonstrated state-of-the-art performance on sequential prediction problems using deep neural networks and Reinforcement Learning (RL). For some of these problems, oracles that can demonstrate good performance may be available during training, but are not used by plain RL methods. To take advantage of this extra information, we propose AggreVaTeD, an extension of the Imitation Learning (IL) approach of Ross & Bagnell (2014). AggreVaTeD allows us to use expressive differentiable policy representations such as deep networks, while leveraging training-time oracles to achieve faster and more accurate solutions with less training data. Specifically, we present two gradient procedures that can learn neural network policies for several problems, including a sequential prediction task and several high-dimensional robotics control problems. We also provide a comprehensive theoretical study of IL that demonstrates that we can expect up to exponentially lower sample complexity for learning with AggreVaTeD than with plain RL algorithms. Our results and theory indicate that IL (and AggreVaTeD in particular) can be a more effective strategy for sequential prediction than plain RL.
Author Information
Wen Sun (Carnegie Mellon University)
Arun Venkatraman (Carnegie Mellon University)
Geoff Gordon (Carnegie Mellon University)
Byron Boots (Georgia Tech)
Drew Bagnell (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Talk: Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction
  Wed Aug 9th 05:48 -- 06:06 AM Room C4.6 & C4.7
More from the Same Authors
- 2019 Workshop: Real-world Sequential Decision Making: Reinforcement Learning and Beyond
  Hoang Le · Yisong Yue · Adith Swaminathan · Byron Boots · Ching-An Cheng
- 2019 Poster: Provably Efficient Imitation Learning from Observation Alone
  Wen Sun · Anirudh Vemula · Byron Boots · Drew Bagnell
- 2019 Oral: Provably Efficient Imitation Learning from Observation Alone
  Wen Sun · Anirudh Vemula · Byron Boots · Drew Bagnell
- 2019 Poster: Predictor-Corrector Policy Optimization
  Ching-An Cheng · Xinyan Yan · Nathan Ratliff · Byron Boots
- 2019 Poster: Contextual Memory Trees
  Wen Sun · Alina Beygelzimer · Hal Daume · John Langford · Paul Mineiro
- 2019 Poster: On Learning Invariant Representations for Domain Adaptation
  Han Zhao · Remi Tachet des Combes · Kun Zhang · Geoff Gordon
- 2019 Oral: Predictor-Corrector Policy Optimization
  Ching-An Cheng · Xinyan Yan · Nathan Ratliff · Byron Boots
- 2019 Oral: On Learning Invariant Representations for Domain Adaptation
  Han Zhao · Remi Tachet des Combes · Kun Zhang · Geoff Gordon
- 2019 Oral: Contextual Memory Trees
  Wen Sun · Alina Beygelzimer · Hal Daume · John Langford · Paul Mineiro
- 2018 Poster: Recurrent Predictive State Policy Networks
  Ahmed Hefny · Zita Marinho · Wen Sun · Siddhartha Srinivasa · Geoff Gordon
- 2018 Oral: Recurrent Predictive State Policy Networks
  Ahmed Hefny · Zita Marinho · Wen Sun · Siddhartha Srinivasa · Geoff Gordon
- 2017 Poster: Prediction under Uncertainty in Sparse Spectrum Gaussian Processes with Applications to Filtering and Control
  Yunpeng Pan · Xinyan Yan · Evangelos Theodorou · Byron Boots
- 2017 Talk: Prediction under Uncertainty in Sparse Spectrum Gaussian Processes with Applications to Filtering and Control
  Yunpeng Pan · Xinyan Yan · Evangelos Theodorou · Byron Boots
- 2017 Poster: Safety-Aware Algorithms for Adversarial Contextual Bandit
  Wen Sun · Debadeepta Dey · Ashish Kapoor
- 2017 Talk: Safety-Aware Algorithms for Adversarial Contextual Bandit
  Wen Sun · Debadeepta Dey · Ashish Kapoor