In recent years, Deep Reinforcement Learning has made impressive advances in solving several important benchmark problems for sequential decision making. Many control applications use a generic multilayer perceptron (MLP) for non-vision parts of the policy network. In this work, we propose a new neural network architecture for the policy network representation that is simple yet effective. The proposed Structured Control Net (SCN) splits the generic MLP into two separate sub-modules: a nonlinear control module and a linear control module. Intuitively, the nonlinear control is for forward-looking and global control, while the linear control stabilizes the local dynamics around the residual of global control. We hypothesize that this will bring together the benefits of both linear and nonlinear policies: improve training sample efficiency, final episodic reward, and generalization of learned policy, while requiring a smaller network and being generally applicable to different training methods. We validated our hypothesis with competitive results on simulations from OpenAI MuJoCo, Roboschool, Atari, and a custom urban driving environment, with various ablation and generalization tests, trained with multiple black-box and policy gradient training methods. The proposed architecture has the potential to improve upon broader control tasks by incorporating problem specific priors into the architecture. As a case study, we demonstrate much improved performance for locomotion tasks by emulating the biological central pattern generators (CPGs) as the nonlinear part of the architecture.
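The split described in the abstract can be illustrated with a minimal sketch. It assumes the outputs of the two modules are simply added to form the action, that the linear module has the form K s + b, and that the nonlinear module is a small bias-free tanh MLP. The class name, layer sizes, and initialization are hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian initialization (an arbitrary illustrative choice)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class StructuredControlNetSketch:
    """Illustrative policy: action = nonlinear(state) + linear(state)."""

    def __init__(self, state_dim, action_dim, hidden_dim=16, seed=0):
        rng = random.Random(seed)
        # Linear control module: u_l = K s + b (assumed form).
        self.K = random_matrix(action_dim, state_dim, rng)
        self.b = [0.0] * action_dim
        # Nonlinear control module: one hidden tanh layer, no biases (assumed).
        self.W1 = random_matrix(hidden_dim, state_dim, rng)
        self.W2 = random_matrix(action_dim, hidden_dim, rng)

    def linear(self, s):
        return [k + b for k, b in zip(matvec(self.K, s), self.b)]

    def nonlinear(self, s):
        hidden = [math.tanh(v) for v in matvec(self.W1, s)]
        return matvec(self.W2, hidden)

    def act(self, s):
        # Assumption: the two module outputs are summed elementwise.
        return [un + ul for un, ul in zip(self.nonlinear(s), self.linear(s))]


policy = StructuredControlNetSketch(state_dim=4, action_dim=2)
state = [0.1, -0.2, 0.3, 0.0]
action = policy.act(state)
```

Keeping the two modules as separate parameter sets means either one can be swapped out independently, for example replacing the MLP with a task-specific nonlinear module such as the CPG-inspired one mentioned in the abstract.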
Author Information
Mario Srouji (Stanford University)
Presenting a long talk on Structured Control Nets for Deep Reinforcement Learning. Work was done when I was an intern at Apple AI Research. I am a rising Master’s student in computer science at Stanford University, and am currently interning at Nvidia Deep Learning Architecture.
Jian Zhang (Apple Inc.)
AI and Robotics. AI Research & Autonomous System Technologies at Apple
Ruslan Salakhutdinov (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: Structured Control Nets for Deep Reinforcement Learning
  Wed. Jul 11th 12:30 -- 12:50 PM Room A1
More from the Same Authors
- 2021: Online Sub-Sampling for Reinforcement Learning with General Function Approximation
  Dingwen Kong · Ruslan Salakhutdinov · Ruosong Wang · Lin Yang
- 2023: Plan, Eliminate, and Track --- Language Models are Good Teachers for Embodied Agents.
  Yue Wu · So Yeon Min · Yonatan Bisk · Ruslan Salakhutdinov · Amos Azaria · Yuanzhi Li · Tom Mitchell · Shrimai Prabhumoye
- 2023: SPRING: Studying Papers and Reasoning to play Games
  Yue Wu · Shrimai Prabhumoye · So Yeon Min · Yonatan Bisk · Ruslan Salakhutdinov · Amos Azaria · Tom Mitchell · Yuanzhi Li
- 2023 Poster: Graph Generative Model for Benchmarking Graph Neural Networks
  Minji Yoon · Yue Wu · John Palowitch · Bryan Perozzi · Ruslan Salakhutdinov
- 2022 Poster: Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs
  Tianwei Ni · Benjamin Eysenbach · Ruslan Salakhutdinov
- 2022 Spotlight: Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs
  Tianwei Ni · Benjamin Eysenbach · Ruslan Salakhutdinov
- 2021 Poster: Towards Understanding and Mitigating Social Biases in Language Models
  Paul Liang · Chiyu Wu · LP Morency · Ruslan Salakhutdinov
- 2021 Poster: Reasoning Over Virtual Knowledge Bases With Open Predicate Relations
  Haitian Sun · Patrick Verga · Bhuwan Dhingra · Ruslan Salakhutdinov · William Cohen
- 2021 Spotlight: Reasoning Over Virtual Knowledge Bases With Open Predicate Relations
  Haitian Sun · Patrick Verga · Bhuwan Dhingra · Ruslan Salakhutdinov · William Cohen
- 2021 Spotlight: Towards Understanding and Mitigating Social Biases in Language Models
  Paul Liang · Chiyu Wu · LP Morency · Ruslan Salakhutdinov
- 2021 Poster: Instabilities of Offline RL with Pre-Trained Neural Representation
  Ruosong Wang · Yifan Wu · Ruslan Salakhutdinov · Sham Kakade
- 2021 Spotlight: Instabilities of Offline RL with Pre-Trained Neural Representation
  Ruosong Wang · Yifan Wu · Ruslan Salakhutdinov · Sham Kakade
- 2021 Poster: Information Obfuscation of Graph Neural Networks
  Peiyuan Liao · Han Zhao · Keyulu Xu · Tommi Jaakkola · Geoff Gordon · Stefanie Jegelka · Ruslan Salakhutdinov
- 2021 Poster: Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
  Yue Wu · Shuangfei Zhai · Nitish Srivastava · Joshua M Susskind · Jian Zhang · Ruslan Salakhutdinov · Hanlin Goh
- 2021 Poster: On Proximal Policy Optimization's Heavy-tailed Gradients
  Saurabh Garg · Joshua Zhanson · Emilio Parisotto · Adarsh Prasad · Zico Kolter · Zachary Lipton · Sivaraman Balakrishnan · Ruslan Salakhutdinov · Pradeep Ravikumar
- 2021 Spotlight: On Proximal Policy Optimization's Heavy-tailed Gradients
  Saurabh Garg · Joshua Zhanson · Emilio Parisotto · Adarsh Prasad · Zico Kolter · Zachary Lipton · Sivaraman Balakrishnan · Ruslan Salakhutdinov · Pradeep Ravikumar
- 2021 Spotlight: Information Obfuscation of Graph Neural Networks
  Peiyuan Liao · Han Zhao · Keyulu Xu · Tommi Jaakkola · Geoff Gordon · Stefanie Jegelka · Ruslan Salakhutdinov
- 2021 Spotlight: Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
  Yue Wu · Shuangfei Zhai · Nitish Srivastava · Joshua M Susskind · Jian Zhang · Ruslan Salakhutdinov · Hanlin Goh
- 2020 Workshop: Workshop on Learning in Artificial Open Worlds
  Arthur Szlam · Katja Hofmann · Ruslan Salakhutdinov · Noboru Kuno · William Guss · Kavya Srinet · Brandon Houghton
- 2020 Workshop: Bridge Between Perception and Reasoning: Graph Neural Networks & Beyond
  Jian Tang · Le Song · Jure Leskovec · Renjie Liao · Yujia Li · Sanja Fidler · Richard Zemel · Ruslan Salakhutdinov
- 2019 Talk: Opening Remarks
  Kamalika Chaudhuri · Ruslan Salakhutdinov
- 2018 Poster: Transformation Autoregressive Networks
  Junier Oliva · Kumar Avinava Dubey · Manzil Zaheer · Barnabás Póczos · Ruslan Salakhutdinov · Eric Xing · Jeff Schneider
- 2018 Oral: Transformation Autoregressive Networks
  Junier Oliva · Kumar Avinava Dubey · Manzil Zaheer · Barnabás Póczos · Ruslan Salakhutdinov · Eric Xing · Jeff Schneider
- 2018 Poster: Gated Path Planning Networks
  Lisa Lee · Emilio Parisotto · Devendra Singh Chaplot · Eric Xing · Ruslan Salakhutdinov
- 2018 Oral: Gated Path Planning Networks
  Lisa Lee · Emilio Parisotto · Devendra Singh Chaplot · Eric Xing · Ruslan Salakhutdinov
- 2017 Poster: Toward Controlled Generation of Text
  Zhiting Hu · Zichao Yang · Xiaodan Liang · Ruslan Salakhutdinov · Eric Xing
- 2017 Poster: Improved Variational Autoencoders for Text Modeling using Dilated Convolutions
  Zichao Yang · Zhiting Hu · Ruslan Salakhutdinov · Taylor Berg-Kirkpatrick
- 2017 Talk: Improved Variational Autoencoders for Text Modeling using Dilated Convolutions
  Zichao Yang · Zhiting Hu · Ruslan Salakhutdinov · Taylor Berg-Kirkpatrick
- 2017 Talk: Toward Controlled Generation of Text
  Zhiting Hu · Zichao Yang · Xiaodan Liang · Ruslan Salakhutdinov · Eric Xing