Reinforcement learning (RL) is inefficient on long-horizon tasks due to sparse rewards, and the learned policy can be fragile to slightly perturbed environments. We address these challenges via a curriculum of tasks with coupled environments, generated by two policies trained jointly with RL: (1) a cooperative planning policy that recursively decomposes a hard task into a coarse-to-fine sub-task tree; and (2) an adversarial policy that modifies the environment of each sub-task. The two are complementary in acquiring more informative feedback for RL: (1) provides dense rewards from easier sub-tasks, while (2) makes the sub-tasks' environments more challenging and diverse. Conversely, both are trained by RL's dense feedback on sub-tasks, so the curriculum they generate stays adaptive to RL's progress. The sub-task tree enables an easy-to-hard curriculum for every policy: its top-down construction gradually increases the number of sub-tasks the planner needs to generate, while the adversarial training between the environment and RL follows a bottom-up traversal that starts from a dense sequence of easier sub-tasks, allowing more frequent environment changes. We compare EAT-C with RL/planning methods targeting similar problems and with methods using environment generators or adversarial agents. Extensive experiments on diverse tasks demonstrate the advantages of our method in improving RL's efficiency and generalization.
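The interplay described above can be pictured as two nested loops: a top-down loop in which the planner grows the sub-task tree deeper over time, and a bottom-up loop in which the RL agent trains on the finest (easiest) sub-tasks whose environments the adversary perturbs. The following is a minimal, hypothetical Python sketch of that control flow only; all class names, reward signals, and update rules are illustrative placeholders and not the paper's actual implementation.

import random
from dataclasses import dataclass, field
from typing import List

@dataclass
class SubTask:
    goal: str
    children: List["SubTask"] = field(default_factory=list)

class PlannerPolicy:
    """Recursively decomposes a hard task into a coarse-to-fine sub-task tree."""
    def decompose(self, task: SubTask, depth: int) -> SubTask:
        if depth == 0:
            return task
        # Placeholder rule: split every task into two easier sub-tasks.
        task.children = [
            self.decompose(SubTask(f"{task.goal}.{i}"), depth - 1) for i in range(2)
        ]
        return task

class AdversaryPolicy:
    """Perturbs each sub-task's environment to make it more challenging/diverse."""
    def perturb(self, subtask: SubTask) -> float:
        return random.uniform(0.0, 0.5)  # placeholder difficulty offset

class RLAgent:
    """Stand-in RL agent; returns a dense reward per sub-task attempt."""
    def __init__(self):
        self.skill = 0.5
    def attempt(self, subtask: SubTask, difficulty: float) -> float:
        reward = max(0.0, self.skill - difficulty + random.gauss(0, 0.1))
        self.skill += 0.01 * reward  # placeholder policy update from dense reward
        return reward

def leaves_bottom_up(node: SubTask) -> List[SubTask]:
    """Placeholder bottom-up traversal: collect the finest (easiest) sub-tasks first."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves_bottom_up(child)]

planner, adversary, agent = PlannerPolicy(), AdversaryPolicy(), RLAgent()
for depth in range(1, 4):                    # top-down: the tree grows deeper over training
    tree = planner.decompose(SubTask("reach_goal"), depth)
    for sub in leaves_bottom_up(tree):       # bottom-up: train on easier sub-tasks first
        difficulty = adversary.perturb(sub)  # adversary reshapes the sub-task's environment
        reward = agent.attempt(sub, difficulty)
        print(f"depth={depth} sub-task={sub.goal} difficulty={difficulty:.2f} reward={reward:.2f}")

In the actual method the planner and adversary are themselves policies updated from the RL agent's dense sub-task feedback; the fixed splitting and random perturbation above merely stand in for those learned components.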
Author Information
Shuang Ao (University of Technology Sydney)
Tianyi Zhou (University of Washington)
Jing Jiang (University of Technology Sydney)
Guodong Long (University of Technology Sydney)
Xuan Song
Chengqi Zhang (University of Technology Sydney)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: EAT-C: Environment-Adversarial sub-Task Curriculum for Efficient Reinforcement Learning »
  Thu. Jul 21st, 06:40 -- 06:45 PM, Room 318 - 320
More from the Same Authors
- 2022 : Vote for Nearest Neighbors Meta-Pruning of Self-Supervised Networks »
  Haiyan Zhao · Tianyi Zhou · Guodong Long · Jing Jiang · Chengqi Zhang
- 2022 : Federated Learning from Pre-Trained Models: A Contrastive Learning Approach »
  Yue Tan · Guodong Long · Jie Ma · LU LIU · Tianyi Zhou · Jing Jiang
- 2023 Poster: Continual Task Allocation in Meta-Policy Network via Sparse Prompting »
  Yijun Yang · Tianyi Zhou · Jing Jiang · Guodong Long · Yuhui Shi
- 2023 Poster: Does Continual Learning Equally Forget All Parameters? »
  Haiyan Zhao · Tianyi Zhou · Guodong Long · Jing Jiang · Chengqi Zhang
- 2022 : Does Continual Learning Equally Forget All Parameters? »
  Haiyan Zhao · Tianyi Zhou · Guodong Long · Jing Jiang · Chengqi Zhang
- 2022 Poster: Identity-Disentangled Adversarial Augmentation for Self-supervised Learning »
  Kaiwen Yang · Tianyi Zhou · Xinmei Tian · Dacheng Tao
- 2022 Spotlight: Identity-Disentangled Adversarial Augmentation for Self-supervised Learning »
  Kaiwen Yang · Tianyi Zhou · Xinmei Tian · Dacheng Tao