The performance of off-policy learning methods, including deep Q-learning and deep deterministic policy gradient (DDPG), depends critically on the choice of exploration policy. Existing exploration methods mostly add noise to the current actor policy and can therefore explore only local regions close to what the actor policy dictates. In this work, we develop a simple meta-policy gradient algorithm that adaptively learns the exploration policy in DDPG. The algorithm can train flexible exploration behaviors that are independent of the actor policy, yielding a global exploration that significantly speeds up the learning process. In an extensive study, we show that our method significantly improves the sample efficiency of DDPG on a variety of continuous-control reinforcement learning tasks.
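To make the abstract's point about local exploration concrete, here is a minimal sketch of the noise-based baseline it describes, not of the paper's meta-policy gradient method. The linear `actor_policy`, the noise scale `sigma`, and the action bounds are all hypothetical choices made for illustration.

```python
import random

def actor_policy(state):
    # Hypothetical deterministic actor: a fixed linear map, for illustration only.
    return [0.5 * s for s in state]

def noisy_action(state, sigma=0.1, low=-1.0, high=1.0, rng=random):
    """Perturb the actor's action with Gaussian noise, then clip to the action bounds."""
    action = actor_policy(state)
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]

rng = random.Random(0)  # seeded generator so the run is repeatable
state = [0.2, -0.4]
base = actor_policy(state)
samples = [noisy_action(state, sigma=0.1, rng=rng) for _ in range(1000)]

# Largest per-dimension distance between any explored action and the actor's action.
max_dev = max(abs(a - b) for s in samples for a, b in zip(s, base))
print(max_dev)
```

Under these assumptions every explored action stays within a few multiples of `sigma` of the actor's own action, which is the sense in which such exploration is local.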
Author Information
Tianbing Xu (Baidu Research, USA)
Qiang Liu (UT Austin)
Liang Zhao (Baidu Research, USA)
Jian Peng (UIUC)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: Learning to Explore via Meta-Policy Gradient
  Fri. Jul 13th 08:20 -- 08:30 AM Room A1
More from the Same Authors
- 2021 : Coordinate-wise Control Variates for Deep Policy Gradients
  Yuanyi Zhong · Yuan Zhou · Jian Peng
- 2022 : Is Self-Supervised Contrastive Learning More Robust Than Supervised Learning?
  Yuanyi Zhong · Haoran Tang · Junkun Chen · Jian Peng · Yu-Xiong Wang
- 2023 Poster: DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design
  Jiaqi Guan · Xiangxin Zhou · Yuwei Yang · Yu Bao · Jian Peng · Jianzhu Ma · Qiang Liu · Liang Wang · Quanquan Gu
- 2022 Poster: Off-Policy Reinforcement Learning with Delayed Rewards
  Beining Han · Zhizhou Ren · Zuofan Wu · Yuan Zhou · Jian Peng
- 2022 Spotlight: Off-Policy Reinforcement Learning with Delayed Rewards
  Beining Han · Zhizhou Ren · Zuofan Wu · Yuan Zhou · Jian Peng
- 2022 Poster: Centroid Approximation for Bootstrap: Improving Particle Quality at Inference
  Mao Ye · Qiang Liu
- 2022 Poster: How to Fill the Optimum Set? Population Gradient Descent with Harmless Diversity
  Chengyue Gong · · Qiang Liu
- 2022 Poster: Proximal Exploration for Model-guided Protein Sequence Design
  Zhizhou Ren · Jiahan Li · Fan Ding · Yuan Zhou · Jianzhu Ma · Jian Peng
- 2022 Poster: Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets
  Xingang Peng · Shitong Luo · Jiaqi Guan · Qi Xie · Jian Peng · Jianzhu Ma
- 2022 Spotlight: Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets
  Xingang Peng · Shitong Luo · Jiaqi Guan · Qi Xie · Jian Peng · Jianzhu Ma
- 2022 Spotlight: How to Fill the Optimum Set? Population Gradient Descent with Harmless Diversity
  Chengyue Gong · · Qiang Liu
- 2022 Spotlight: Centroid Approximation for Bootstrap: Improving Particle Quality at Inference
  Mao Ye · Qiang Liu
- 2022 Spotlight: Proximal Exploration for Model-guided Protein Sequence Design
  Zhizhou Ren · Jiahan Li · Fan Ding · Yuan Zhou · Jianzhu Ma · Jian Peng
- 2022 Poster: A Langevin-like Sampler for Discrete Distributions
  Ruqi Zhang · Xingchao Liu · Qiang Liu
- 2022 Spotlight: A Langevin-like Sampler for Discrete Distributions
  Ruqi Zhang · Xingchao Liu · Qiang Liu
- 2021 Poster: AlphaNet: Improved Training of Supernets with Alpha-Divergence
  Dilin Wang · Chengyue Gong · Meng Li · Qiang Liu · Vikas Chandra
- 2021 Oral: AlphaNet: Improved Training of Supernets with Alpha-Divergence
  Dilin Wang · Chengyue Gong · Meng Li · Qiang Liu · Vikas Chandra
- 2021 Poster: Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
  Bo Liu · Qiang Liu · Peter Stone · Animesh Garg · Yuke Zhu · Anima Anandkumar
- 2021 Oral: Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
  Bo Liu · Qiang Liu · Peter Stone · Animesh Garg · Yuke Zhu · Anima Anandkumar
- 2020 Poster: Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection
  Mao Ye · Chengyue Gong · Lizhen Nie · Denny Zhou · Adam Klivans · Qiang Liu
- 2020 Poster: Go Wide, Then Narrow: Efficient Training of Deep Thin Networks
  Denny Zhou · Mao Ye · Chen Chen · Tianjian Meng · Mingxing Tan · Xiaodan Song · Quoc Le · Qiang Liu · Dale Schuurmans
- 2020 Poster: Accountable Off-Policy Evaluation With Kernel Bellman Statistics
  Yihao Feng · Tongzheng Ren · Ziyang Tang · Qiang Liu
- 2020 Poster: A Chance-Constrained Generative Framework for Sequence Optimization
  Xianggen Liu · Qiang Liu · Sen Song · Jian Peng
- 2019 Workshop: Stein’s Method for Machine Learning and Statistics
  Francois-Xavier Briol · Lester Mackey · Chris Oates · Qiang Liu · Larry Goldstein
- 2019 Poster: Improving Neural Language Modeling via Adversarial Training
  Dilin Wang · Chengyue Gong · Qiang Liu
- 2019 Oral: Improving Neural Language Modeling via Adversarial Training
  Dilin Wang · Chengyue Gong · Qiang Liu
- 2019 Poster: Quantile Stein Variational Gradient Descent for Batch Bayesian Optimization
  Chengyue Gong · Jian Peng · Qiang Liu
- 2019 Poster: Nonlinear Stein Variational Gradient Descent for Learning Diversified Mixture Models
  Dilin Wang · Qiang Liu
- 2019 Poster: A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization
  Yucheng Chen · Matus Telgarsky · Chao Zhang · Bolton Bailey · Daniel Hsu · Jian Peng
- 2019 Oral: Quantile Stein Variational Gradient Descent for Batch Bayesian Optimization
  Chengyue Gong · Jian Peng · Qiang Liu
- 2019 Oral: Nonlinear Stein Variational Gradient Descent for Learning Diversified Mixture Models
  Dilin Wang · Qiang Liu
- 2019 Oral: A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization
  Yucheng Chen · Matus Telgarsky · Chao Zhang · Bolton Bailey · Daniel Hsu · Jian Peng
- 2018 Poster: Stein Variational Gradient Descent Without Gradient
  Jun Han · Qiang Liu
- 2018 Oral: Stein Variational Gradient Descent Without Gradient
  Jun Han · Qiang Liu
- 2018 Poster: Goodness-of-fit Testing for Discrete Distributions via Stein Discrepancy
  Jiasen Yang · Qiang Liu · Vinayak A Rao · Jennifer Neville
- 2018 Poster: Stein Variational Message Passing for Continuous Graphical Models
  Dilin Wang · Zhe Zeng · Qiang Liu
- 2018 Oral: Goodness-of-fit Testing for Discrete Distributions via Stein Discrepancy
  Jiasen Yang · Qiang Liu · Vinayak A Rao · Jennifer Neville
- 2018 Oral: Stein Variational Message Passing for Continuous Graphical Models
  Dilin Wang · Zhe Zeng · Qiang Liu