Although deep reinforcement learning (DRL) has achieved substantial success, it may encounter catastrophic failures due to the intrinsic uncertainty caused by stochastic policies and environment variability. To address this issue, we propose a novel reinforcement learning framework, CVaR-Proximal-Policy-Optimization (CPPO), which uses the conditional value-at-risk (CVaR) as its measure of risk. We show theoretically that the performance degradation under observation state disturbance and transition probability disturbance depends on the range of the disturbance as well as on the gap in the value function between different states. Therefore, constraining the value function among states with CVaR can improve the robustness of the policy. Experimental results show that CPPO achieves higher cumulative reward and exhibits stronger robustness against observation state disturbance and transition probability disturbance in environment dynamics across a series of continuous control tasks in MuJoCo.
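To make the risk measure concrete, here is a minimal sketch of an empirical CVaR estimate over a batch of episode returns. It is an illustration only, not the CPPO algorithm: the function name `empirical_cvar`, the sample returns, and the convention of averaging the worst `alpha` fraction of returns (lower tail) are all assumptions, since the abstract does not specify them.

```python
import math


def empirical_cvar(returns, alpha=0.1):
    """Estimate lower-tail CVaR as the mean of the worst ceil(alpha * n) returns.

    Assumed convention: higher return is better, so "risk" lives in the
    lowest returns. Other texts define CVaR on losses instead.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    k = max(1, math.ceil(alpha * len(returns)))  # number of tail samples
    worst = sorted(returns)[:k]                  # the k lowest returns
    return sum(worst) / k


# Hypothetical episode returns: mostly fine, with two bad episodes.
episode_returns = [10.0, 12.0, -50.0, 11.0, 9.0, 13.0, 8.0, 12.0, 10.0, -20.0]
mean_return = sum(episode_returns) / len(episode_returns)
tail_risk = empirical_cvar(episode_returns, alpha=0.2)
print(mean_return, tail_risk)
```

In this toy batch the mean return looks acceptable while the tail estimate is strongly negative, which is the kind of gap a CVaR-based constraint is meant to expose.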
Author Information
Chengyang Ying (Tsinghua University)
Xinning Zhou (Tsinghua University)
Dong Yan (Tsinghua University)
Jun Zhu (Tsinghua University)
More from the Same Authors
-
2021 : Strategically-timed State-Observation Attacks on Deep Reinforcement Learning Agents »
Xinning Zhou · You Qiaoben · Chengyang Ying · Jun Zhu -
2021 : Adversarial Semantic Contour for Object Detection »
Yichi Zhang · Zijian Zhu · Xiao Yang · Jun Zhu -
2021 : Query-based Adversarial Attacks on Graph with Fake Nodes »
Zhengyi Wang · Zhongkai Hao · Jun Zhu -
2023 Poster: MultiAdam: Parameter-wise Scale-invariant Optimizer for Physics-informed Neural Network »
Jiachen Yao · Chang Su · Zhongkai Hao · LIU SONGMING · Hang Su · Jun Zhu -
2023 Poster: NUNO: A General Framework for Learning Parametric PDEs with Non-Uniform Data »
LIU SONGMING · Zhongkai Hao · Chengyang Ying · Hang Su · Ze Cheng · Jun Zhu -
2023 Poster: Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs »
Kaiwen Zheng · Cheng Lu · Jianfei Chen · Jun Zhu -
2023 Poster: GNOT: A General Neural Operator Transformer for Operator Learning »
Zhongkai Hao · Zhengyi Wang · Hang Su · Chengyang Ying · Yinpeng Dong · LIU SONGMING · Ze Cheng · Jian Song · Jun Zhu -
2023 Poster: Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning »
Cheng Lu · Huayu Chen · Jianfei Chen · Hang Su · Chongxuan Li · Jun Zhu -
2023 Poster: One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale »
Fan Bao · Shen Nie · Kaiwen Xue · Chongxuan Li · Shi Pu · Yaole Wang · Gang Yue · Yue Cao · Hang Su · Jun Zhu -
2023 Poster: Stabilizing GANs' Training with Brownian Motion Controller »
Tianjiao Luo · Ziyu Zhu · Jianfei Chen · Jun Zhu -
2023 Poster: Revisiting Discriminative vs. Generative Classifiers: Theory and Implications »
Chenyu Zheng · Guoqiang Wu · Fan Bao · Yue Cao · Chongxuan Li · Jun Zhu -
2022 Poster: NeuralEF: Deconstructing Kernels by Deep Neural Networks »
Zhijie Deng · Jiaxin Shi · Jun Zhu -
2022 Spotlight: NeuralEF: Deconstructing Kernels by Deep Neural Networks »
Zhijie Deng · Jiaxin Shi · Jun Zhu -
2022 Poster: Robustness and Accuracy Could Be Reconcilable by (Proper) Definition »
Tianyu Pang · Min Lin · Xiao Yang · Jun Zhu · Shuicheng Yan -
2022 Poster: Fast Lossless Neural Compression with Integer-Only Discrete Flows »
Siyu Wang · Jianfei Chen · Chongxuan Li · Jun Zhu · Bo Zhang -
2022 Spotlight: Fast Lossless Neural Compression with Integer-Only Discrete Flows »
Siyu Wang · Jianfei Chen · Chongxuan Li · Jun Zhu · Bo Zhang -
2022 Spotlight: Robustness and Accuracy Could Be Reconcilable by (Proper) Definition »
Tianyu Pang · Min Lin · Xiao Yang · Jun Zhu · Shuicheng Yan -
2022 Poster: Thompson Sampling for (Combinatorial) Pure Exploration »
Siwei Wang · Jun Zhu -
2022 Spotlight: Thompson Sampling for (Combinatorial) Pure Exploration »
Siwei Wang · Jun Zhu -
2021 Poster: Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models »
Fan Bao · Kun Xu · Chongxuan Li · Lanqing Hong · Jun Zhu · Bo Zhang -
2021 Spotlight: Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models »
Fan Bao · Kun Xu · Chongxuan Li · Lanqing Hong · Jun Zhu · Bo Zhang -
2020 Poster: Understanding and Stabilizing GANs' Training Dynamics Using Control Theory »
Kun Xu · Chongxuan Li · Jun Zhu · Bo Zhang -
2020 Poster: Variance Reduction and Quasi-Newton for Particle-Based Variational Inference »
Michael Zhu · Chang Liu · Jun Zhu -
2020 Poster: VFlow: More Expressive Generative Flows with Variational Data Augmentation »
Jianfei Chen · Cheng Lu · Biqi Chenli · Jun Zhu · Tian Tian -
2020 Poster: Nonparametric Score Estimators »
Yuhao Zhou · Jiaxin Shi · Jun Zhu -
2019 Poster: Improving Adversarial Robustness via Promoting Ensemble Diversity »
Tianyu Pang · Kun Xu · Chao Du · Ning Chen · Jun Zhu -
2019 Oral: Improving Adversarial Robustness via Promoting Ensemble Diversity »
Tianyu Pang · Kun Xu · Chao Du · Ning Chen · Jun Zhu -
2018 Poster: Message Passing Stein Variational Gradient Descent »
Jingwei Zhuo · Chang Liu · Jiaxin Shi · Jun Zhu · Ning Chen · Bo Zhang -
2018 Poster: Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors »
Yichi Zhou · Jun Zhu · Jingwei Zhuo -
2018 Oral: Message Passing Stein Variational Gradient Descent »
Jingwei Zhuo · Chang Liu · Jiaxin Shi · Jun Zhu · Ning Chen · Bo Zhang -
2018 Oral: Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors »
Yichi Zhou · Jun Zhu · Jingwei Zhuo -
2018 Poster: Max-Mahalanobis Linear Discriminant Analysis Networks »
Tianyu Pang · Chao Du · Jun Zhu -
2018 Poster: Adversarial Attack on Graph Structured Data »
Hanjun Dai · Hui Li · Tian Tian · Xin Huang · Lin Wang · Jun Zhu · Le Song -
2018 Oral: Max-Mahalanobis Linear Discriminant Analysis Networks »
Tianyu Pang · Chao Du · Jun Zhu -
2018 Oral: Adversarial Attack on Graph Structured Data »
Hanjun Dai · Hui Li · Tian Tian · Xin Huang · Lin Wang · Jun Zhu · Le Song -
2018 Poster: Stochastic Training of Graph Convolutional Networks with Variance Reduction »
Jianfei Chen · Jun Zhu · Le Song -
2018 Poster: A Spectral Approach to Gradient Estimation for Implicit Distributions »
Jiaxin Shi · Shengyang Sun · Jun Zhu -
2018 Oral: A Spectral Approach to Gradient Estimation for Implicit Distributions »
Jiaxin Shi · Shengyang Sun · Jun Zhu -
2018 Oral: Stochastic Training of Graph Convolutional Networks with Variance Reduction »
Jianfei Chen · Jun Zhu · Le Song -
2017 Poster: Identify the Nash Equilibrium in Static Games with Random Payoffs »
Yichi Zhou · Jialian Li · Jun Zhu -
2017 Talk: Identify the Nash Equilibrium in Static Games with Random Payoffs »
Yichi Zhou · Jialian Li · Jun Zhu