Asynchronous Coagent Networks
James Kostas · Chris Nota · Philip Thomas

Tue Jul 14 07:00 AM -- 07:45 AM & Tue Jul 14 06:00 PM -- 06:45 PM (PDT)

Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks. In this work, we prove that CPGAs converge to locally optimal policies. Additionally, we extend prior theory to encompass asynchronous and recurrent coagent networks. These extensions facilitate the straightforward design and analysis of hierarchical reinforcement learning algorithms like the option-critic, and eliminate the need for complex derivations of customized learning rules for these algorithms.
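To make the idea of a coagent network concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy setting: two softmax coagents in a feedforward chain, a one-step task that pays reward 1 when the final action is 1, and a plain REINFORCE update with no baseline. Each coagent updates only its own parameters from its own input and output plus the shared return, treating the rest of the network as part of its environment. All names (`train`, `local_update`, `expected_reward`) are illustrative, and the asynchronous and recurrent extensions described in the abstract are not shown.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs)[0]


def local_update(logits, chosen, ret, lr):
    """Local REINFORCE step for one coagent (in place):
    logits += lr * ret * grad_logits log pi(chosen)."""
    probs = softmax(logits)
    for i in range(len(logits)):
        grad = (1.0 if i == chosen else 0.0) - probs[i]
        logits[i] += lr * ret * grad


def train(steps=2000, lr=0.1, seed=0):
    """Train a two-coagent chain on the assumed toy task."""
    rng = random.Random(seed)
    theta1 = [0.0, 0.0]                # coagent 1: logits over hidden output u
    theta2 = [[0.0, 0.0], [0.0, 0.0]]  # coagent 2: logits over action a, one row per u
    for _ in range(steps):
        u = sample(softmax(theta1), rng)     # coagent 1 acts
        a = sample(softmax(theta2[u]), rng)  # coagent 2 acts, seeing only u
        ret = 1.0 if a == 1 else 0.0         # assumed toy reward
        # Each coagent uses only its own input/output and the shared return.
        local_update(theta1, u, ret, lr)
        local_update(theta2[u], a, ret, lr)
    return theta1, theta2


def expected_reward(theta1, theta2):
    """Exact expected reward of the whole network on the toy task."""
    p_u = softmax(theta1)
    return sum(p_u[u] * softmax(theta2[u])[1] for u in range(2))


if __name__ == "__main__":
    t1, t2 = train()
    print("expected reward after training:", expected_reward(t1, t2))
```

No coagent backpropagates through another: the updates are purely local, which is the property that the convergence result in the abstract concerns. The toy task is deliberately trivial, so the sketch only illustrates the update structure and says nothing about performance on real problems.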

Author Information

James Kostas (University of Massachusetts Amherst)
Chris Nota (University of Massachusetts Amherst)
Philip Thomas (University of Massachusetts Amherst)
