Poster
Deep Reinforcement Learning with Smooth Policy
Qianli Shen · Yan Li · Haoming Jiang · Zhaoran Wang · Tuo Zhao
Thu Jul 16 06:00 AM -- 06:45 AM & Thu Jul 16 06:00 PM -- 06:45 PM (PDT)
Deep reinforcement learning (RL) has achieved great empirical success in various domains. However, the large search space of neural networks requires a large amount of data, which makes current RL algorithms sample inefficient.
Motivated by the fact that many environments with continuous state space have smooth transitions, we propose to learn a policy that behaves smoothly with respect to states. We develop a new framework, \textbf{S}mooth \textbf{R}egularized \textbf{R}einforcement \textbf{L}earning ($\textbf{SR}^2\textbf{L}$), in which the policy is trained with smoothness-inducing regularization. Such regularization effectively constrains the search space and enforces smoothness in the learned policy. Moreover, our proposed framework can also improve the robustness of the policy against measurement error in the state space, and can be naturally extended to the distributionally robust setting. We apply the proposed framework to both on-policy (TRPO) and off-policy (DDPG) algorithms. Through extensive experiments, we demonstrate that our method achieves improved sample efficiency and robustness.
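As a rough illustration of what a smoothness-inducing regularizer could look like, the sketch below penalizes how much a policy's output changes when the state is perturbed within a small box. The abstract does not give the exact form of the regularizer, so everything here is an assumption: the names `linear_policy`, `smoothness_penalty`, and `regularized_objective` are hypothetical, the squared-distance penalty is one possible choice, and random sampling is used as a simple stand-in for an inner maximization over perturbed states. It is not the authors' implementation.

```python
import numpy as np


def linear_policy(weights, states):
    """Hypothetical deterministic policy: action = tanh(states @ weights)."""
    return np.tanh(states @ weights)


def smoothness_penalty(policy_fn, weights, states, epsilon=0.1, n_samples=16, seed=0):
    """Estimate how much the policy output moves under small state perturbations.

    For each state s, sample perturbed states s' with |s' - s| <= epsilon
    (element-wise) and keep the largest squared change in the action.
    Random sampling is an assumed, crude substitute for a true maximization.
    """
    rng = np.random.default_rng(seed)
    base_actions = policy_fn(weights, states)      # shape: (batch, action_dim)
    worst_gap = np.zeros(states.shape[0])          # per-state largest change seen
    for _ in range(n_samples):
        noise = rng.uniform(-epsilon, epsilon, size=states.shape)
        perturbed_actions = policy_fn(weights, states + noise)
        gap = np.sum((perturbed_actions - base_actions) ** 2, axis=1)
        worst_gap = np.maximum(worst_gap, gap)
    return float(worst_gap.mean())                 # average over the batch


def regularized_objective(task_loss, penalty, lam=1.0):
    """Combine a task loss with the smoothness penalty, weighted by lam."""
    return task_loss + lam * penalty


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    states = rng.normal(size=(32, 4))              # 32 states, 4 features each
    weights = rng.normal(size=(4, 2))              # maps states to 2-dim actions
    penalty = smoothness_penalty(linear_policy, weights, states)
    print("smoothness penalty:", penalty)
    print("regularized loss:", regularized_objective(1.0, penalty, lam=0.5))
```

In an actual TRPO or DDPG training loop, a term of this kind would be added to the policy objective and differentiated with an autodiff framework; the coefficient `lam` and the radius `epsilon` are tuning parameters assumed here for illustration.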
Author Information
Qianli Shen (Peking University)
Yan Li (Georgia Tech)
Haoming Jiang (Georgia Tech)
Zhaoran Wang (Northwestern)
Tuo Zhao (Georgia Tech)
More from the Same Authors
- 2021: Randomized Least Squares Policy Optimization
  Haque Ishfaq · Zhuoran Yang · Andrei Lupu · Viet Nguyen · Lewis Liu · Riashat Islam · Zhaoran Wang · Doina Precup
- 2023 Poster: Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data
  Minshuo Chen · Kaixuan Huang · Tuo Zhao · Mengdi Wang
- 2023 Poster: SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process
  Zichong Li · Yanbo Xu · Simiao Zuo · Haoming Jiang · Chao Zhang · Tuo Zhao · Hongyuan Zha
- 2023 Poster: LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
  Yixiao Li · Yifan Yu · Qingru Zhang · Chen Liang · Pengcheng He · Weizhu Chen · Tuo Zhao
- 2023 Poster: Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories
  Zixuan Zhang · Minshuo Chen · Mengdi Wang · Wenjing Liao · Tuo Zhao
- 2023 Poster: Machine Learning Force Fields with Data Cost Aware Training
  Alexander Bukharin · Tianyi Liu · Shengjie Wang · Simiao Zuo · Weihao Gao · Wen Yan · Tuo Zhao
- 2023 Poster: Less is More: Task-aware Layer-wise Distillation for Language Model Compression
  Chen Liang · Simiao Zuo · Qingru Zhang · Pengcheng He · Weizhu Chen · Tuo Zhao
- 2022 Poster: PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
  Qingru Zhang · Simiao Zuo · Chen Liang · Alexander Bukharin · Pengcheng He · Weizhu Chen · Tuo Zhao
- 2022 Poster: Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint
  Hao Liu · Minshuo Chen · Siawpeng Er · Wenjing Liao · Tong Zhang · Tuo Zhao
- 2022 Spotlight: PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
  Qingru Zhang · Simiao Zuo · Chen Liang · Alexander Bukharin · Pengcheng He · Weizhu Chen · Tuo Zhao
- 2022 Spotlight: Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint
  Hao Liu · Minshuo Chen · Siawpeng Er · Wenjing Liao · Tong Zhang · Tuo Zhao
- 2021 Poster: Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks
  Hao Liu · Minshuo Chen · Tuo Zhao · Wenjing Liao
- 2021 Poster: How Important is the Train-Validation Split in Meta-Learning?
  Yu Bai · Minshuo Chen · Pan Zhou · Tuo Zhao · Jason Lee · Sham Kakade · Huan Wang · Caiming Xiong
- 2021 Spotlight: Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks
  Hao Liu · Minshuo Chen · Tuo Zhao · Wenjing Liao
- 2021 Spotlight: How Important is the Train-Validation Split in Meta-Learning?
  Yu Bai · Minshuo Chen · Pan Zhou · Tuo Zhao · Jason Lee · Sham Kakade · Huan Wang · Caiming Xiong
- 2021 Poster: Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time
  Weichen Wang · Jiequn Han · Zhuoran Yang · Zhaoran Wang
- 2021 Spotlight: Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time
  Weichen Wang · Jiequn Han · Zhuoran Yang · Zhaoran Wang
- 2020 Poster: Transformer Hawkes Process
  Simiao Zuo · Haoming Jiang · Zichong Li · Tuo Zhao · Hongyuan Zha
- 2020 Poster: Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model
  Ying Jin · Zhaoran Wang · Junwei Lu
- 2019 Poster: On Scalable and Efficient Computation of Large Scale Optimal Transport
  Yujia Xie · Minshuo Chen · Haoming Jiang · Tuo Zhao · Hongyuan Zha
- 2019 Oral: On Scalable and Efficient Computation of Large Scale Optimal Transport
  Yujia Xie · Minshuo Chen · Haoming Jiang · Tuo Zhao · Hongyuan Zha
- 2019 Poster: Toward Understanding the Importance of Noise in Training Neural Networks
  Mo Zhou · Tianyi Liu · Yan Li · Dachao Lin · Enlu Zhou · Tuo Zhao
- 2019 Oral: Toward Understanding the Importance of Noise in Training Neural Networks
  Mo Zhou · Tianyi Liu · Yan Li · Dachao Lin · Enlu Zhou · Tuo Zhao