Talk
Hierarchy Through Composition with Multitask LMDPs
Andrew Saxe · Adam Earle · Benjamin Rosman

Mon Aug 07 12:51 AM -- 01:09 AM (PDT) @ C4.5

Hierarchical architectures are critical to the scalability of reinforcement learning methods. Most current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme exploits the guaranteed concurrent compositionality provided by the linearly solvable Markov decision process (LMDP) framework, which naturally enables a learning agent to draw on several macro-actions simultaneously to solve new tasks. We introduce the Multitask LMDP module, which maintains a parallel distributed representation of tasks and may be stacked to form deep hierarchies abstracted in space and time.
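The "guaranteed concurrent compositionality" mentioned above comes from the fact that the LMDP Bellman equation is linear in the boundary (terminal) desirabilities. The following is a minimal, hypothetical NumPy sketch, not the authors' code: the random passive dynamics, the uniform interior state cost, and all variable names (P_ii, P_ib, Z_b, etc.) are illustrative assumptions. It shows that blending basis-task boundary desirabilities with weights w yields exactly the blend of the basis-task solutions.

    import numpy as np

    rng = np.random.default_rng(0)
    n_i, n_b = 6, 3   # interior and boundary (absorbing) state counts
    q = 0.1           # interior state cost, assumed uniform for this sketch

    # Passive dynamics: each interior state transitions to interior or boundary states.
    P = rng.random((n_i, n_i + n_b))
    P /= P.sum(axis=1, keepdims=True)
    P_ii, P_ib = P[:, :n_i], P[:, n_i:]

    # Basis tasks: task k makes boundary state k desirable (near 1) and the rest near 0.
    Z_b = np.full((n_b, n_b), 1e-3) + (1.0 - 1e-3) * np.eye(n_b)

    # First-exit LMDP: (I - exp(-q) P_ii) z_i = exp(-q) P_ib z_b, linear in z_b.
    A = np.eye(n_i) - np.exp(-q) * P_ii
    Z_i = np.linalg.solve(A, np.exp(-q) * P_ib @ Z_b)  # columns = basis-task solutions

    # Concurrent composition: a new task whose boundary desirability is a weighted
    # blend of the basis tasks is solved by blending the basis solutions with the
    # same weights -- no re-planning needed.
    w = np.array([0.5, 0.3, 0.2])
    z_composed = Z_i @ w
    z_direct = np.linalg.solve(A, np.exp(-q) * P_ib @ (Z_b @ w))
    assert np.allclose(z_composed, z_direct)  # exact, by linearity

This linearity is what allows an agent to draw on several basis tasks simultaneously rather than executing them serially, which is the property the Multitask LMDP module builds on.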

Author Information

Andrew Saxe (Harvard University)
Adam Earle (University of the Witwatersrand)

I am currently a PhD candidate at the University of the Witwatersrand, South Africa. My interests lie at the intersection of deep learning and reinforcement learning. My current research focuses on the principled development of hierarchy and abstraction.

Benjamin Rosman (Council for Scientific and Industrial Research (CSIR))

Benjamin Rosman received a Ph.D. degree in Informatics from the University of Edinburgh in 2014. Previously, he obtained an M.Sc. in Artificial Intelligence from the University of Edinburgh, a Bachelor of Science (Honours) in Computer Science from the University of the Witwatersrand, South Africa, and a Bachelor of Science (Honours) in Computational and Applied Mathematics, also from the University of the Witwatersrand. He is an Associate Professor in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand. He is the Chair of the IEEE South African joint chapter of Control Systems, and Robotics and Automation. Prof. Rosman’s research interests focus primarily on learning and decision making in autonomous systems, in particular studying how learning can be accelerated through abstracting and generalising knowledge gained from solving previous problems. He additionally works in the area of skill and behaviour learning for robots.
