Poster
Asynchronous Coagent Networks
James Kostas · Chris Nota · Philip Thomas
Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks. In this work, we prove that CPGAs converge to locally optimal policies. Additionally, we extend prior theory to encompass asynchronous and recurrent coagent networks. These extensions facilitate the straightforward design and analysis of hierarchical reinforcement learning algorithms like the option-critic, and eliminate the need for complex derivations of customized learning rules for these algorithms.
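To make the abstract's setting concrete, below is a minimal Python sketch of the core coagent policy gradient idea: each coagent in a stochastic network applies its own local REINFORCE-style update using the shared return, treating the rest of the network as part of its environment. The two-coagent architecture, the toy problem, and all names here are illustrative assumptions, not the paper's setup or experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

class Coagent:
    """A tabular softmax policy over a small discrete input/output space."""
    def __init__(self, n_in, n_out, lr=0.1):
        self.theta = np.zeros((n_in, n_out))
        self.lr = lr

    def act(self, x):
        p = softmax(self.theta[x])
        a = rng.choice(len(p), p=p)
        return a, p

    def update(self, x, a, p, g):
        # Local REINFORCE update: grad of log pi(a|x) for a softmax policy,
        # scaled by the *shared* return g. No gradient flows between coagents.
        grad = -p
        grad[a] += 1.0
        self.theta[x] += self.lr * g * grad

# Hypothetical toy problem: observation in {0, 1}; reward 1 iff the final
# action matches the observation. Coagent 1 maps the observation to a
# stochastic hidden activation; coagent 2 maps that activation to an action.
c1 = Coagent(n_in=2, n_out=2)
c2 = Coagent(n_in=2, n_out=2)

for episode in range(2000):
    obs = rng.integers(2)
    h, p1 = c1.act(obs)           # coagent 1 fires
    a, p2 = c2.act(h)             # coagent 2 fires
    g = 1.0 if a == obs else 0.0  # shared return for the whole network
    c1.update(obs, h, p1, g)      # each coagent runs its own local update
    c2.update(h, a, p2, g)
```

In this sketch the coagents fire in a fixed order each step; the asynchronous extension studied in the paper relaxes exactly this assumption, allowing coagents to execute at different rates or times.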
Author Information
James Kostas (University of Massachusetts Amherst)
Chris Nota (University of Massachusetts Amherst)
Philip Thomas (University of Massachusetts Amherst)
More from the Same Authors
- 2021 Spotlight: Towards Practical Mean Bounds for Small Samples (My Phan · Philip Thomas · Erik Learned-Miller)
- 2021 Poster: Towards Practical Mean Bounds for Small Samples (My Phan · Philip Thomas · Erik Learned-Miller)
- 2021 Poster: Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods (Chris Nota · Philip Thomas · Bruno C. da Silva)
- 2021 Spotlight: Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods (Chris Nota · Philip Thomas · Bruno C. da Silva)
- 2021 Poster: High Confidence Generalization for Reinforcement Learning (James Kostas · Yash Chandak · Scott Jordan · Georgios Theocharous · Philip Thomas)
- 2021 Spotlight: High Confidence Generalization for Reinforcement Learning (James Kostas · Yash Chandak · Scott Jordan · Georgios Theocharous · Philip Thomas)
- 2020 Poster: Evaluating the Performance of Reinforcement Learning Algorithms (Scott Jordan · Yash Chandak · Daniel Cohen · Mengxue Zhang · Philip Thomas)
- 2020 Poster: Optimizing for the Future in Non-Stationary MDPs (Yash Chandak · Georgios Theocharous · Shiv Shankar · Martha White · Sridhar Mahadevan · Philip Thomas)
- 2019 Poster: Concentration Inequalities for Conditional Value at Risk (Philip Thomas · Erik Learned-Miller)
- 2019 Oral: Concentration Inequalities for Conditional Value at Risk (Philip Thomas · Erik Learned-Miller)
- 2019 Poster: Learning Action Representations for Reinforcement Learning (Yash Chandak · Georgios Theocharous · James Kostas · Scott Jordan · Philip Thomas)
- 2019 Oral: Learning Action Representations for Reinforcement Learning (Yash Chandak · Georgios Theocharous · James Kostas · Scott Jordan · Philip Thomas)
- 2018 Poster: Decoupling Gradient-Like Learning Rules from Representations (Philip Thomas · Christoph Dann · Emma Brunskill)
- 2018 Oral: Decoupling Gradient-Like Learning Rules from Representations (Philip Thomas · Christoph Dann · Emma Brunskill)