Reinforcement learning is typically concerned with learning control policies tailored to a particular agent. We investigate whether there exists a single policy that generalizes to controlling a wide variety of agent morphologies -- ones in which even the dimensionality of the state and action spaces changes. Such a policy would distill general and modular sensorimotor patterns that can be applied to control arbitrary agents. We propose a policy expressed as a collection of identical modular neural networks, one for each of the agent's actuators. Every module is only responsible for controlling its own actuator and receives information only from its local sensors. In addition, messages are passed between modules, propagating information between distant modules. A single modular policy can successfully generate locomotion behaviors for over 20 planar agents with different skeletal structures, such as monopod hoppers, quadrupeds, and bipeds, and generalize to variants not seen during training -- a process that would normally require training and manual hyperparameter tuning for each morphology. We observe a wide variety of markedly different locomotion styles across morphologies, as well as centralized coordination emerging via message passing between decentralized modules purely from the reinforcement learning objective. Video and code: https://huangwl18.github.io/modular-rl/
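The idea of one set of weights shared by every actuator module, with messages linking the modules, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked code for that). It assumes a tree-shaped morphology, one module per limb, a leaves-to-root pass followed by a root-to-leaves pass, and untrained random weights. The names `SharedModule` and `act`, the layer sizes, and the example morphologies are all hypothetical.

```python
import numpy as np


class SharedModule:
    """One set of weights reused by every limb module (sizes are hypothetical)."""

    def __init__(self, obs_dim, msg_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.msg_dim = msg_dim
        # Upward step: local observation + summed child messages -> message to parent.
        self.W_up = rng.normal(0.0, 0.5, (obs_dim + msg_dim, msg_dim))
        # Downward step: own upward message + parent message -> [action, message to children].
        self.W_down = rng.normal(0.0, 0.5, (2 * msg_dim, 1 + msg_dim))

    def up(self, obs, child_msg_sum):
        return np.tanh(np.concatenate([obs, child_msg_sum]) @ self.W_up)

    def down(self, up_msg, parent_msg):
        out = np.tanh(np.concatenate([up_msg, parent_msg]) @ self.W_down)
        return out[0], out[1:]  # scalar action in [-1, 1], message for children


def act(module, parents, observations):
    """Compute one action per limb.

    parents[i] is the index of limb i's parent (-1 for the root).
    Assumes limbs are ordered so that parents[i] < i.
    """
    n = len(parents)
    child_sum = np.zeros((n, module.msg_dim))
    up_msgs = np.zeros((n, module.msg_dim))
    for i in reversed(range(n)):  # leaves to root
        up_msgs[i] = module.up(observations[i], child_sum[i])
        if parents[i] >= 0:
            child_sum[parents[i]] += up_msgs[i]

    actions = np.zeros(n)
    down_msgs = np.zeros((n, module.msg_dim))
    for i in range(n):  # root to leaves
        parent_msg = down_msgs[parents[i]] if parents[i] >= 0 else np.zeros(module.msg_dim)
        actions[i], down_msgs[i] = module.down(up_msgs[i], parent_msg)
    return actions


# The same module controls two morphologies with different numbers of limbs.
module = SharedModule(obs_dim=4, msg_dim=8)
chain = [-1, 0, 1]            # hypothetical 3-limb chain
branched = [-1, 0, 0, 1, 2]   # hypothetical 5-limb branched body
rng = np.random.default_rng(1)
obs_chain = rng.normal(size=(3, 4))
obs_branched = rng.normal(size=(5, 4))
actions_chain = act(module, chain, obs_chain)
actions_branched = act(module, branched, obs_branched)
```

Because the weights belong to the module rather than to the agent, the parameter count does not depend on the number of limbs; only the `parents` list and the per-limb observations change between agents. In this sketch a limb's action can depend on a distant limb's sensors only through the messages, which is the role message passing plays in the description above.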
Author Information
Wenlong Huang (UC Berkeley)
Igor Mordatch (Google Brain)
Deepak Pathak (CMU, FAIR)
More from the Same Authors
-
2021 : Discovering and Achieving Goals with World Models »
Russell Mendonca · Oleh Rybkin · Kostas Daniilidis · Danijar Hafner · Deepak Pathak -
2021 : Decision Transformer: Reinforcement Learning via Sequence Modeling »
Lili Chen · Kevin Lu · Aravind Rajeswaran · Kimin Lee · Aditya Grover · Michael Laskin · Pieter Abbeel · Aravind Srinivas · Igor Mordatch -
2023 : Internet Explorer: Targeted Representation Learning on the Open Web »
Alexander Li · Ellis Brown · Alexei Efros · Deepak Pathak -
2023 : Your Diffusion Model is Secretly a Zero-Shot Classifier »
Alexander Li · Mihir Prabhudesai · Shivam Duggal · Ellis Brown · Deepak Pathak -
2023 : Test-time Adaptation with Diffusion Models »
Mihir Prabhudesai · Tsung-Wei Ke · Alexander Li · Deepak Pathak · Katerina Fragkiadaki -
2023 Poster: Efficient RL via Disentangled Environment and Agent Representations »
Kevin Gmelin · Shikhar Bahl · Russell Mendonca · Deepak Pathak -
2023 Oral: Efficient RL via Disentangled Environment and Agent Representations »
Kevin Gmelin · Shikhar Bahl · Russell Mendonca · Deepak Pathak -
2023 Poster: Internet Explorer: Targeted Representation Learning on the Open Web »
Alexander Li · Ellis Brown · Alexei Efros · Deepak Pathak -
2023 Poster: Test-time Adaptation with Slot-Centric Models »
Mihir Prabhudesai · Anirudh Goyal · Sujoy Paul · Sjoerd van Steenkiste · Mehdi S. M. Sajjadi · Gaurav Aggarwal · Thomas Kipf · Deepak Pathak · Katerina Fragkiadaki -
2022 Poster: Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning »
Seyed Kamyar Seyed Ghasemipour · Satoshi Kataoka · Byron David · Daniel Freeman · Shixiang Gu · Igor Mordatch -
2022 Poster: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents »
Wenlong Huang · Pieter Abbeel · Deepak Pathak · Igor Mordatch -
2022 Spotlight: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents »
Wenlong Huang · Pieter Abbeel · Deepak Pathak · Igor Mordatch -
2022 Spotlight: Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning »
Seyed Kamyar Seyed Ghasemipour · Satoshi Kataoka · Byron David · Daniel Freeman · Shixiang Gu · Igor Mordatch -
2022 Poster: Zero-Shot Reward Specification via Grounded Natural Language »
Parsa Mahmoudieh · Deepak Pathak · Trevor Darrell -
2022 Poster: REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer »
Xingyu Liu · Deepak Pathak · Kris Kitani -
2022 Spotlight: Zero-Shot Reward Specification via Grounded Natural Language »
Parsa Mahmoudieh · Deepak Pathak · Trevor Darrell -
2022 Oral: REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer »
Xingyu Liu · Deepak Pathak · Kris Kitani -
2022 Poster: Learning Iterative Reasoning through Energy Minimization »
Yilun Du · Shuang Li · Josh Tenenbaum · Igor Mordatch -
2022 Spotlight: Learning Iterative Reasoning through Energy Minimization »
Yilun Du · Shuang Li · Josh Tenenbaum · Igor Mordatch -
2021 : Oral Presentation: Discovering and Achieving Goals with World Models »
Oleh Rybkin · Deepak Pathak -
2021 Poster: Differentiable Spatial Planning using Transformers »
Devendra Singh Chaplot · Deepak Pathak · Jitendra Malik -
2021 Poster: Improved Contrastive Divergence Training of Energy-Based Models »
Yilun Du · Shuang Li · Josh Tenenbaum · Igor Mordatch -
2021 Spotlight: Improved Contrastive Divergence Training of Energy-Based Models »
Yilun Du · Shuang Li · Josh Tenenbaum · Igor Mordatch -
2021 Spotlight: Differentiable Spatial Planning using Transformers »
Devendra Singh Chaplot · Deepak Pathak · Jitendra Malik -
2021 Poster: Unsupervised Learning of Visual 3D Keypoints for Control »
Boyuan Chen · Pieter Abbeel · Deepak Pathak -
2021 Poster: Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot »
Joel Z Leibo · Edgar Duenez-Guzman · Alexander Vezhnevets · John Agapiou · Peter Sunehag · Raphael Koster · Jayd Matyas · Charles Beattie · Igor Mordatch · Thore Graepel -
2021 Poster: Model-Based Reinforcement Learning via Latent-Space Collocation »
Oleh Rybkin · Chuning Zhu · Anusha Nagabandi · Kostas Daniilidis · Igor Mordatch · Sergey Levine -
2021 Spotlight: Model-Based Reinforcement Learning via Latent-Space Collocation »
Oleh Rybkin · Chuning Zhu · Anusha Nagabandi · Kostas Daniilidis · Igor Mordatch · Sergey Levine -
2021 Spotlight: Unsupervised Learning of Visual 3D Keypoints for Control »
Boyuan Chen · Pieter Abbeel · Deepak Pathak -
2021 Oral: Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot »
Joel Z Leibo · Edgar Duenez-Guzman · Alexander Vezhnevets · John Agapiou · Peter Sunehag · Raphael Koster · Jayd Matyas · Charles Beattie · Igor Mordatch · Thore Graepel -
2020 : Energy-Based Models for Object-Oriented Learning »
Igor Mordatch -
2020 Poster: A Game Theoretic Framework for Model Based Reinforcement Learning »
Aravind Rajeswaran · Igor Mordatch · Vikash Kumar -
2020 Poster: Planning to Explore via Self-Supervised World Models »
Ramanan Sekar · Oleh Rybkin · Kostas Daniilidis · Pieter Abbeel · Danijar Hafner · Deepak Pathak -
2020 Tutorial: Model-Based Methods in Reinforcement Learning »
Igor Mordatch · Jessica Hamrick -
2019 Poster: Self-Supervised Exploration via Disagreement »
Deepak Pathak · Dhiraj Gandhi · Abhinav Gupta -
2019 Oral: Self-Supervised Exploration via Disagreement »
Deepak Pathak · Dhiraj Gandhi · Abhinav Gupta -
2018 Poster: Investigating Human Priors for Playing Video Games »
Rachit Dubey · Pulkit Agrawal · Deepak Pathak · Tom Griffiths · Alexei Efros -
2018 Oral: Investigating Human Priors for Playing Video Games »
Rachit Dubey · Pulkit Agrawal · Deepak Pathak · Tom Griffiths · Alexei Efros -
2017 Poster: Curiosity-driven Exploration by Self-supervised Prediction »
Deepak Pathak · Pulkit Agrawal · Alexei Efros · Trevor Darrell -
2017 Talk: Curiosity-driven Exploration by Self-supervised Prediction »
Deepak Pathak · Pulkit Agrawal · Alexei Efros · Trevor Darrell