EAT-C: Environment-Adversarial sub-Task Curriculum for Efficient Reinforcement Learning
Shuang Ao · Tianyi Zhou · Jing Jiang · Guodong Long · Xuan Song · Chengqi Zhang

Thu Jul 21 03:00 PM -- 05:00 PM (PDT) @ Hall E #926

Reinforcement learning (RL) is inefficient on long-horizon tasks because rewards are sparse, and the learned policy can be fragile under slight perturbations of the environment. We address these challenges via a curriculum of tasks with coupled environments, generated by two policies trained jointly with RL: (1) a cooperative planning policy that recursively decomposes a hard task into a coarse-to-fine sub-task tree; and (2) an adversarial policy that modifies the environment in each sub-task. The two policies are complementary in giving RL more informative feedback: (1) provides dense rewards from easier sub-tasks, while (2) makes the sub-tasks' environments more challenging and diverse. Conversely, both policies are trained on RL's dense feedback on sub-tasks, so the curriculum they generate remains adaptive to RL's progress. The sub-task tree enables an easy-to-hard curriculum for every policy: its top-down construction gradually increases the number of sub-tasks the planner needs to generate, while the adversarial training between the environment and RL follows a bottom-up traversal that starts from a dense sequence of easier sub-tasks, allowing more frequent environment changes. We compare EAT-C with RL and planning methods targeting similar problems, and with methods that use environment generators or adversarial agents. Extensive experiments on diverse tasks demonstrate the advantages of our method in improving RL's efficiency and generalization.
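
As a rough illustration of the sub-task tree idea only, the sketch below builds a tree top-down and then reads it bottom-up as an easy-to-hard curriculum. It is not the authors' implementation: the 1-D start/goal tasks, the binary split, and the names SubTask, midpoint_planner, build_tree, and bottom_up_curriculum are all assumptions made for this example. A fixed midpoint rule stands in for the learned planning policy, and the adversarial environment policy is omitted entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SubTask:
    """Hypothetical sub-task: reach `goal` from `start` on a 1-D line."""
    start: float
    goal: float
    children: List["SubTask"] = field(default_factory=list)


def midpoint_planner(start: float, goal: float) -> float:
    """Stand-in for a learned planner: propose the midpoint as a sub-goal."""
    return (start + goal) / 2.0


def build_tree(start: float, goal: float, depth: int,
               planner: Callable[[float, float], float] = midpoint_planner) -> SubTask:
    """Top-down construction: each level splits every task into two sub-tasks."""
    node = SubTask(start, goal)
    if depth > 0:
        mid = planner(start, goal)
        node.children = [
            build_tree(start, mid, depth - 1, planner),
            build_tree(mid, goal, depth - 1, planner),
        ]
    return node


def levels(root: SubTask) -> List[List[Tuple[float, float]]]:
    """Collect (start, goal) pairs level by level, from the root downward."""
    out = []
    frontier = [root]
    while frontier:
        out.append([(n.start, n.goal) for n in frontier])
        frontier = [c for n in frontier for c in n.children]
    return out


def bottom_up_curriculum(root: SubTask) -> List[List[Tuple[float, float]]]:
    """Bottom-up traversal: many short sub-tasks first, the full task last."""
    return list(reversed(levels(root)))


curriculum = bottom_up_curriculum(build_tree(0.0, 8.0, depth=2))
for stage, tasks in enumerate(curriculum):
    print(stage, tasks)
```

With depth 2, the first stage contains four short sub-tasks and the last stage contains only the original task, which mirrors the easy-to-hard ordering described in the abstract under these simplifying assumptions.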

Author Information

Shuang Ao (University of Technology Sydney)
Tianyi Zhou (University of Washington)
Jing Jiang (University of Technology Sydney)
Guodong Long (University of Technology Sydney)
Xuan Song
Chengqi Zhang (University of Technology Sydney)
