Hierarchical architectures are critical to the scalability of reinforcement learning methods. Most current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme exploits the guaranteed concurrent compositionality provided by the linearly solvable Markov decision process (LMDP) framework, which naturally enables a learning agent to draw on several macro-actions simultaneously to solve new tasks. We introduce the Multitask LMDP module, which maintains a parallel distributed representation of tasks and may be stacked to form deep hierarchies abstracted in space and time.
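The concurrent compositionality mentioned above can be illustrated with a minimal sketch. It assumes the standard first-exit LMDP formulation, where the desirability function z = exp(-v) satisfies a linear equation. It is not the authors' Multitask LMDP implementation. The five-state chain, the cost values, the blend weights, and the helper name `solve_first_exit_lmdp` are all hypothetical choices made for illustration.

```python
import numpy as np

def solve_first_exit_lmdp(P_ii, P_ib, q_interior, z_boundary):
    """Return the desirability z = exp(-v) at interior states.

    Assumes the first-exit LMDP equation
        z_i = exp(-q_i) * (P_ii @ z_i + P_ib @ z_b),
    which is linear in the boundary desirability z_b.
    """
    M = np.diag(np.exp(-q_interior))
    A = np.eye(len(q_interior)) - M @ P_ii
    return np.linalg.solve(A, M @ P_ib @ z_boundary)

# Hypothetical 5-state chain: states 0 and 4 are terminal (boundary),
# states 1..3 are interior. Passive dynamics are an unbiased random walk.
P_ii = np.array([[0.0, 0.5, 0.0],
                 [0.5, 0.0, 0.5],
                 [0.0, 0.5, 0.0]])   # interior -> interior
P_ib = np.array([[0.5, 0.0],
                 [0.0, 0.0],
                 [0.0, 0.5]])        # interior -> boundary
q_interior = np.full(3, 0.1)         # small per-step state cost

# Two base tasks that differ only in their terminal costs.
zb_left = np.exp(-np.array([0.0, 5.0]))   # cheap to exit on the left
zb_right = np.exp(-np.array([5.0, 0.0]))  # cheap to exit on the right
z_left = solve_first_exit_lmdp(P_ii, P_ib, q_interior, zb_left)
z_right = solve_first_exit_lmdp(P_ii, P_ib, q_interior, zb_right)

# A new task whose boundary desirability is a weighted blend of the base tasks.
w = np.array([0.3, 0.7])
zb_blend = w[0] * zb_left + w[1] * zb_right
z_blend = solve_first_exit_lmdp(P_ii, P_ib, q_interior, zb_blend)

# Compositionality: the blended task's solution is the same blend of solutions.
z_composed = w[0] * z_left + w[1] * z_right
print(np.allclose(z_blend, z_composed))
```

Because the equation is linear in the boundary desirability, the solution to the blended task is obtained from the stored base-task solutions without solving it again. This is the property that lets an agent draw on several previously solved tasks at once.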
Andrew Saxe (Harvard University)
Adam Earle (University of the Witwatersrand)
I am currently a PhD candidate at the University of the Witwatersrand, South Africa. My interests lie at the intersection of deep learning and reinforcement learning. My current research focuses on the principled development of hierarchy and abstraction.
Benjamin Rosman (Council for Scientific and Industrial Research (CSIR))
Benjamin Rosman received a Ph.D. degree in Informatics from the University of Edinburgh in 2014. Previously, he obtained an M.Sc. in Artificial Intelligence from the University of Edinburgh, a Bachelor of Science (Honours) in Computer Science from the University of the Witwatersrand, South Africa, and a Bachelor of Science (Honours) in Computational and Applied Mathematics, also from the University of the Witwatersrand. He is an Associate Professor in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand. He is the Chair of the IEEE South African joint chapter of Control Systems, and Robotics and Automation. Prof. Rosman’s research interests focus primarily on learning and decision making in autonomous systems, in particular studying how learning can be accelerated through abstracting and generalising knowledge gained from solving previous problems. He additionally works in the area of skill and behaviour learning for robots.
Related Events (a corresponding poster, oral, or spotlight)
2017 Talk: Hierarchy Through Composition with Multitask LMDPs »
Mon. Aug 7th 07:51 -- 08:09 AM Room C4.5
More from the Same Authors
2022 : Just-in-Time Sparsity: Learning Dynamic Sparsity Schedules »
· Chiratidzo Matowe · Arnu Pretorius · Benjamin Rosman · Sara Hooker
2020 Poster: Learning Portable Representations for High-Level Planning »
Steven James · Benjamin Rosman · George Konidaris
2019 : Benjamin Rosman: Exploiting Structure For Accelerating Reinforcement Learning »