In lifelong reinforcement learning, agents must effectively transfer knowledge across tasks while simultaneously addressing exploration, credit assignment, and generalization. State abstraction can help overcome these hurdles by compressing the representation used by an agent, thereby reducing the computational and statistical burdens of learning. To this end, we develop theory to compute and use state abstractions in lifelong reinforcement learning. We introduce two new classes of abstractions: (1) transitive state abstractions, whose optimal form can be computed efficiently, and (2) PAC state abstractions, which are guaranteed to hold with respect to a distribution of tasks. We show that the joint family of transitive PAC abstractions can be acquired efficiently, preserves near-optimal behavior, and experimentally reduces sample complexity in simple domains, thereby yielding a family of desirable abstractions for use in lifelong reinforcement learning. Along with these positive results, we show that there are pathological cases where state abstractions can negatively impact performance.
Author Information
David Abel (Brown University)
Dilip S. Arumugam (Stanford University)
Lucas Lehnert (Brown University)
Michael L. Littman (Brown University)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Oral: State Abstractions for Lifelong Reinforcement Learning »
Fri. Jul 13th 09:40 -- 09:50 AM Room A1
More from the Same Authors
-
2021 : Bad-Policy Density: A Measure of Reinforcement-Learning Hardness »
David Abel · Cameron Allen · Dilip Arumugam · D Ellis Hershkowitz · Michael L. Littman · Lawson Wong -
2021 : Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback »
Ishaan Shah · David Halpern · Michael L. Littman · Kavosh Asadi -
2023 : Specifying Behavior Preference with Tiered Reward Functions »
Zhiyuan Zhou · Henry Sowerby · Michael L. Littman -
2023 Poster: Meta-learning Parameterized Skills »
Haotian Fu · Shangqun Yu · Saket Tiwari · Michael L. Littman · George Konidaris -
2019 Poster: Finding Options that Minimize Planning Time »
Yuu Jinnai · David Abel · David Hershkowitz · Michael L. Littman · George Konidaris -
2019 Oral: Finding Options that Minimize Planning Time »
Yuu Jinnai · David Abel · David Hershkowitz · Michael L. Littman · George Konidaris -
2019 Poster: Discovering Options for Exploration by Minimizing Cover Time »
Yuu Jinnai · Jee Won Park · David Abel · George Konidaris -
2019 Oral: Discovering Options for Exploration by Minimizing Cover Time »
Yuu Jinnai · Jee Won Park · David Abel · George Konidaris -
2018 Poster: Policy and Value Transfer in Lifelong Reinforcement Learning »
David Abel · Yuu Jinnai · Sophie Guo · George Konidaris · Michael L. Littman -
2018 Oral: Policy and Value Transfer in Lifelong Reinforcement Learning »
David Abel · Yuu Jinnai · Sophie Guo · George Konidaris · Michael L. Littman -
2018 Poster: Lipschitz Continuity in Model-based Reinforcement Learning »
Kavosh Asadi · Dipendra Misra · Michael L. Littman -
2018 Oral: Lipschitz Continuity in Model-based Reinforcement Learning »
Kavosh Asadi · Dipendra Misra · Michael L. Littman -
2017 : Transfer learning using successor state features »
Lucas Lehnert -
2017 Poster: An Alternative Softmax Operator for Reinforcement Learning »
Kavosh Asadi · Michael L. Littman -
2017 Poster: Interactive Learning from Policy-Dependent Human Feedback »
James MacGlashan · Mark Ho · Robert Loftin · Bei Peng · Guan Wang · David L Roberts · Matthew E. Taylor · Michael L. Littman -
2017 Talk: Interactive Learning from Policy-Dependent Human Feedback »
James MacGlashan · Mark Ho · Robert Loftin · Bei Peng · Guan Wang · David L Roberts · Matthew E. Taylor · Michael L. Littman -
2017 Talk: An Alternative Softmax Operator for Reinforcement Learning »
Kavosh Asadi · Michael L. Littman