When an agent trains for one target task, its experience is expected to be useful for training on another target task. This paper formulates the meta curriculum learning problem: building a sequence of intermediate training tasks, called a curriculum, that helps the learner train toward any given target task. We propose a model-based meta automatic curriculum learning algorithm (MM-ACL) that learns to predict the performance on one task when trained on another, given contextual information such as the history of training tasks, loss functions, and rollout state-action trajectories from the policy. This predictor facilitates generating a curriculum that optimizes the learner's performance on different target tasks. Our empirical results demonstrate that MM-ACL outperforms a random curriculum, a manually created curriculum, and a commonly used non-stationary bandit algorithm in a GridWorld domain.
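To make the role of the predictor concrete, here is a minimal sketch of how a learned performance predictor could drive curriculum generation. It is not the MM-ACL algorithm itself, whose details the abstract does not give. The greedy selection rule, the function names `greedy_curriculum` and `toy_predict`, the predictor signature, and the integer-difficulty task encoding are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical predictor signature (an assumption, not from the paper):
# (history of training tasks, candidate next task, target task) -> predicted score.
Predictor = Callable[[Tuple[int, ...], int, int], float]


def greedy_curriculum(
    predict: Predictor,
    tasks: Sequence[int],
    targets: Sequence[int],
    length: int,
) -> List[int]:
    """Build a curriculum by repeatedly picking the task with the best
    predicted performance, averaged over all target tasks."""
    history: List[int] = []
    for _ in range(length):
        def mean_score(task: int) -> float:
            scores = [predict(tuple(history), task, t) for t in targets]
            return sum(scores) / len(scores)

        # Ties are broken in favor of the earliest task in `tasks`.
        history.append(max(tasks, key=mean_score))
    return history


def toy_predict(history: Tuple[int, ...], candidate: int, target: int) -> float:
    """Toy stand-in for a learned predictor. Tasks are integer difficulty
    levels; a candidate only helps if it is at most one level above the
    hardest task trained on so far."""
    level = max(history, default=0)
    if candidate <= level + 1:
        level = max(level, candidate)
    # Higher is better: zero when the learner's level matches the target.
    return -abs(target - level)


if __name__ == "__main__":
    print(greedy_curriculum(toy_predict, tasks=[1, 2, 3, 4], targets=[3, 4], length=3))
```

In this toy setting the greedy rule climbs one difficulty level per step. A learned predictor would replace `toy_predict` and could condition on richer context, such as loss values or rollout trajectories.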
Author Information
Zifan Xu (University of Texas at Austin)
Yulin Zhang (University of Texas at Austin)
Shahaf Shperberg (University of Texas at Austin)
Reuth Mirsky (The University of Texas at Austin)
Yuqian Jiang (University of Texas at Austin)
Bo Liu (University of Texas, Austin)
Peter Stone (The University of Texas at Austin and Sony AI)
More from the Same Authors
- 2021: Reasoning about Human Behavior in Ad Hoc Teamwork
  Reuth Mirsky
- 2022: Task Factorization in Curriculum Learning
  Reuth Mirsky · Shahaf Shperberg · Yulin Zhang · Zifan Xu · Yuqian Jiang · Jiaxun Cui · Peter Stone
- 2022: Q/A: Invited Speaker: Peter Stone
  Peter Stone
- 2022: Invited Speaker: Peter Stone
  Peter Stone
- 2022 Poster: Causal Dynamics Learning for Task-Independent State Abstraction
  Zizhao Wang · Xuesu Xiao · Zifan Xu · Yuke Zhu · Peter Stone
- 2022 Oral: Causal Dynamics Learning for Task-Independent State Abstraction
  Zizhao Wang · Xuesu Xiao · Zifan Xu · Yuke Zhu · Peter Stone
- 2021 Poster: Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
  Bo Liu · Qiang Liu · Peter Stone · Animesh Garg · Yuke Zhu · Anima Anandkumar
- 2021 Oral: Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
  Bo Liu · Qiang Liu · Peter Stone · Animesh Garg · Yuke Zhu · Anima Anandkumar
- 2020 Poster: Reducing Sampling Error in Batch Temporal Difference Learning
  Brahma Pavse · Ishan Durugkar · Josiah Hanna · Peter Stone
- 2019: Peter Stone: Learning Curricula for Transfer Learning in RL
  Peter Stone
- 2019: panel discussion with Craig Boutilier (Google Research), Emma Brunskill (Stanford), Chelsea Finn (Google Brain, Stanford, UC Berkeley), Mohammad Ghavamzadeh (Facebook AI), John Langford (Microsoft Research) and David Silver (Deepmind)
  Peter Stone · Craig Boutilier · Emma Brunskill · Chelsea Finn · John Langford · David Silver · Mohammad Ghavamzadeh
- 2019: Invited Talk 1: Adaptive Tolling for Multiagent Traffic Optimization
  Peter Stone
- 2019 Poster: Importance Sampling Policy Evaluation with an Estimated Behavior Policy
  Josiah Hanna · Scott Niekum · Peter Stone
- 2019 Oral: Importance Sampling Policy Evaluation with an Estimated Behavior Policy
  Josiah Hanna · Scott Niekum · Peter Stone
- 2017 Poster: Data-Efficient Policy Evaluation Through Behavior Policy Search
  Josiah Hanna · Philip S. Thomas · Peter Stone · Scott Niekum
- 2017 Talk: Data-Efficient Policy Evaluation Through Behavior Policy Search
  Josiah Hanna · Philip S. Thomas · Peter Stone · Scott Niekum