This paper investigates how to weight imperfect expert demonstrations for generative adversarial imitation learning (GAIL). The agent is expected to perform behaviors demonstrated by experts. In many applications, though, experts also make mistakes, and their demonstrations can mislead the agent or slow its learning. Existing methods for imitation learning from imperfect demonstrations mostly rely on preference or confidence scores to distinguish imperfect demonstrations. However, this auxiliary information must be collected with the help of an oracle, which is usually difficult and expensive in practice. In contrast, this paper proposes a method for learning to weight imperfect demonstrations in GAIL without requiring extensive prior information. We provide a rigorous mathematical analysis showing that the weights of demonstrations can be exactly determined by combining the discriminator and the agent policy in GAIL. Theoretical analysis suggests that, with the estimated weights, the agent can learn a policy that goes beyond the plain expert demonstrations. Experiments in the MuJoCo and Atari environments demonstrate that the proposed algorithm outperforms baseline methods in handling imperfect expert demonstrations.
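As a rough illustration of what weighting demonstrations means for the GAIL discriminator, the sketch below computes a binary cross-entropy loss in which each expert sample carries its own weight. It is not the paper's estimator: the abstract does not give the weight formula, so the weights are simply passed in, and the function name `weighted_discriminator_loss`, the `eps` parameter, and the convention that the discriminator outputs the probability of a pair being an expert pair are all assumptions made for this example.

```python
import numpy as np


def weighted_discriminator_loss(d_expert, d_agent, weights, eps=1e-8):
    """Weighted GAIL-style discriminator loss (illustrative sketch only).

    d_expert: discriminator outputs in (0, 1) on expert state-action pairs.
    d_agent:  discriminator outputs in (0, 1) on agent state-action pairs.
    weights:  non-negative per-demonstration weights (hypothetical input;
              the paper derives them from the discriminator and policy).
    """
    d_expert = np.asarray(d_expert, dtype=float)
    d_agent = np.asarray(d_agent, dtype=float)
    w = np.asarray(weights, dtype=float)

    # Normalize so the expert term is a weighted average.
    w = w / w.sum()

    # Expert pairs should be scored close to 1, weighted per sample.
    expert_term = -np.sum(w * np.log(d_expert + eps))
    # Agent pairs should be scored close to 0, weighted uniformly.
    agent_term = -np.mean(np.log(1.0 - d_agent + eps))
    return expert_term + agent_term


d_expert = [0.9, 0.2]  # second demonstration looks unlike the expert
d_agent = [0.1, 0.3]

uniform = weighted_discriminator_loss(d_expert, d_agent, [1.0, 1.0])
reweighted = weighted_discriminator_loss(d_expert, d_agent, [1.0, 0.1])
print(uniform, reweighted)
```

With uniform weights this reduces to the ordinary unweighted loss; shrinking the weight of the second demonstration reduces its contribution to the expert term.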
Author Information
Yunke Wang (Wuhan University)
Chang Xu (University of Sydney)
Bo Du (Wuhan University)
Honglak Lee (Google / U. Michigan)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Poster: Learning to Weight Imperfect Demonstrations »
Wed. Jul 21st 04:00 -- 06:00 AM Room
More from the Same Authors
-
2021 : Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks »
Yijie Guo · Qiucheng Wu · Honglak Lee -
2023 Poster: Go Beyond Imagination: Maximizing Episodic Reachability with World Models »
Yao Fu · Run Peng · Honglak Lee -
2023 Poster: Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers »
Yineng Chen · Zuchao Li · Lefei Zhang · Bo Du · Hai Zhao -
2023 Poster: Dual Focal Loss for Calibration »
Linwei Tao · Minjing Dong · Chang Xu -
2023 Poster: PixelAsParam: A Gradient View on Diffusion Sampling with Guidance »
Anh-Dung Dinh · Daochang Liu · Chang Xu -
2022 Poster: Spatial-Channel Token Distillation for Vision MLPs »
Yanxi Li · Xinghao Chen · Minjing Dong · Yehui Tang · Yunhe Wang · Chang Xu -
2022 Spotlight: Spatial-Channel Token Distillation for Vision MLPs »
Yanxi Li · Xinghao Chen · Minjing Dong · Yehui Tang · Yunhe Wang · Chang Xu -
2021 Poster: Commutative Lie Group VAE for Disentanglement Learning »
Xinqi Zhu · Chang Xu · Dacheng Tao -
2021 Oral: Commutative Lie Group VAE for Disentanglement Learning »
Xinqi Zhu · Chang Xu · Dacheng Tao -
2021 Poster: Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks »
Sungryull Sohn · Sungtae Lee · Jongwook Choi · Harm van Seijen · Mehdi Fatemi · Honglak Lee -
2021 Poster: Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning »
Jongwook Choi · Archit Sharma · Honglak Lee · Sergey Levine · Shixiang Gu -
2021 Poster: K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets »
Xiu Su · Shan You · Mingkai Zheng · Fei Wang · Chen Qian · Changshui Zhang · Chang Xu -
2021 Spotlight: K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets »
Xiu Su · Shan You · Mingkai Zheng · Fei Wang · Chen Qian · Changshui Zhang · Chang Xu -
2021 Spotlight: Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning »
Jongwook Choi · Archit Sharma · Honglak Lee · Sergey Levine · Shixiang Gu -
2021 Spotlight: Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks »
Sungryull Sohn · Sungtae Lee · Jongwook Choi · Harm van Seijen · Mehdi Fatemi · Honglak Lee -
2021 Poster: State Entropy Maximization with Random Encoders for Efficient Exploration »
Younggyo Seo · Lili Chen · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee -
2021 Spotlight: State Entropy Maximization with Random Encoders for Efficient Exploration »
Younggyo Seo · Lili Chen · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee -
2020 Poster: Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning »
Kimin Lee · Younggyo Seo · Seunghyun Lee · Honglak Lee · Jinwoo Shin -
2020 Poster: Neural Architecture Search in A Proxy Validation Loss Landscape »
Yanxi Li · Minjing Dong · Yunhe Wang · Chang Xu -
2020 Poster: Training Binary Neural Networks through Learning with Noisy Supervision »
Kai Han · Yunhe Wang · Yixing Xu · Chunjing Xu · Enhua Wu · Chang Xu -
2019 Poster: Learning Latent Dynamics for Planning from Pixels »
Danijar Hafner · Timothy Lillicrap · Ian Fischer · Ruben Villegas · David Ha · Honglak Lee · James Davidson -
2019 Poster: Robust Inference via Generative Classifiers for Handling Noisy Labels »
Kimin Lee · Sukmin Yun · Kibok Lee · Honglak Lee · Bo Li · Jinwoo Shin -
2019 Poster: Similarity of Neural Network Representations Revisited »
Simon Kornblith · Mohammad Norouzi · Honglak Lee · Geoffrey Hinton -
2019 Oral: Similarity of Neural Network Representations Revisited »
Simon Kornblith · Mohammad Norouzi · Honglak Lee · Geoffrey Hinton -
2019 Oral: Robust Inference via Generative Classifiers for Handling Noisy Labels »
Kimin Lee · Sukmin Yun · Kibok Lee · Honglak Lee · Bo Li · Jinwoo Shin -
2019 Oral: Learning Latent Dynamics for Planning from Pixels »
Danijar Hafner · Timothy Lillicrap · Ian Fischer · Ruben Villegas · David Ha · Honglak Lee · James Davidson -
2019 Poster: LegoNet: Efficient Convolutional Neural Networks with Lego Filters »
Zhaohui Yang · Yunhe Wang · Chuanjian Liu · Hanting Chen · Chunjing Xu · Boxin Shi · Chao Xu · Chang Xu -
2019 Oral: LegoNet: Efficient Convolutional Neural Networks with Lego Filters »
Zhaohui Yang · Yunhe Wang · Chuanjian Liu · Hanting Chen · Chunjing Xu · Boxin Shi · Chao Xu · Chang Xu -
2018 Poster: Self-Imitation Learning »
Junhyuk Oh · Yijie Guo · Satinder Singh · Honglak Lee -
2018 Oral: Self-Imitation Learning »
Junhyuk Oh · Yijie Guo · Satinder Singh · Honglak Lee -
2018 Poster: Hierarchical Long-term Video Prediction without Supervision »
Nevan Wichers · Ruben Villegas · Dumitru Erhan · Honglak Lee -
2018 Oral: Hierarchical Long-term Video Prediction without Supervision »
Nevan Wichers · Ruben Villegas · Dumitru Erhan · Honglak Lee -
2017 Poster: Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning »
Junhyuk Oh · Satinder Singh · Honglak Lee · Pushmeet Kohli -
2017 Talk: Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning »
Junhyuk Oh · Satinder Singh · Honglak Lee · Pushmeet Kohli -
2017 Poster: Learning to Generate Long-term Future via Hierarchical Prediction »
Ruben Villegas · Jimei Yang · Yuliang Zou · Sungryull Sohn · Xunyu Lin · Honglak Lee -
2017 Talk: Learning to Generate Long-term Future via Hierarchical Prediction »
Ruben Villegas · Jimei Yang · Yuliang Zou · Sungryull Sohn · Xunyu Lin · Honglak Lee