We develop a new approach, named Greedy when Sure and Conservative when Uncertain (GSCU), to competing online against unknown and nonstationary opponents. GSCU improves on existing methods in four aspects: 1) it introduces a novel way of learning opponent policy embeddings offline; 2) it trains offline a single best response (conditioned additionally on the learned opponent policy embedding) instead of a finite set of separate best responses against any opponent; 3) it computes online a posterior over the current opponent policy embedding, without making the discrete and ineffective decision of which type the current opponent belongs to; and 4) it selects online between a real-time greedy policy and a fixed conservative policy via an adversarial bandit algorithm, gaining a theoretically better regret than adhering to either. Experimental studies on popular benchmarks demonstrate GSCU's superiority over state-of-the-art methods. The code is available online at https://github.com/YeTianJHU/GSCU.
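As a rough illustration of aspect 4, the sketch below shows how an adversarial bandit could arbitrate between two policies across episodes. It uses EXP3, a standard adversarial bandit algorithm; the abstract does not name the specific algorithm, so this choice is an assumption. The class name `Exp3`, the arm labels, the exploration rate `gamma`, and the requirement that episode returns be rescaled to [0, 1] are all hypothetical and are not taken from the GSCU code base.

```python
import math
import random

GREEDY, CONSERVATIVE = 0, 1  # hypothetical arm labels for the two policies


class Exp3:
    """Minimal EXP3 adversarial bandit (an assumed stand-in, not GSCU's code)."""

    def __init__(self, n_arms, gamma=0.1, seed=None):
        self.n_arms = n_arms
        self.gamma = gamma  # exploration rate in (0, 1]
        self.weights = [1.0] * n_arms
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the normalized weights with uniform exploration.
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select(self):
        # Sample an arm according to the current mixed distribution.
        probs = self.probabilities()
        return self.rng.choices(range(self.n_arms), weights=probs)[0]

    def update(self, arm, reward):
        # `reward` is assumed to be rescaled to [0, 1].
        probs = self.probabilities()
        estimate = reward / probs[arm]  # importance-weighted reward estimate
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale so the largest weight is 1; this avoids overflow and
        # leaves the selection probabilities unchanged.
        largest = max(self.weights)
        self.weights = [w / largest for w in self.weights]
```

In use, one would call `select()` at the start of each episode to pick `GREEDY` or `CONSERVATIVE`, play the episode with that policy, and pass the rescaled episode return to `update()`. Only the chosen arm is updated, which is why the reward is divided by its selection probability.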
Author Information
Haobo Fu (Tencent AI Lab)
Ye Tian (Tencent AI Lab)
Hongxiang Yu (SJTU)
Weiming Liu (University of Science and Technology of China)
Shuang Wu (Tencent)
Jiechao Xiong (Tencent AI Lab)
Ying Wen (Shanghai Jiao Tong University)
Kai Li (Institute of Automation, Chinese Academy of Sciences)
Junliang Xing (Tsinghua University)
Junliang Xing is currently a Professor at the Department of Computer Science and Technology, Tsinghua University. He received his dual B.E. degrees in Computer Science and Applied Mathematics from Xi'an Jiaotong University in 2007 and his Ph.D. degree in Computer Science and Technology from Tsinghua University in 2012. Dr. Xing has published over 120 peer-reviewed papers at conferences such as IJCAI, AAAI, ICCV, and CVPR, and in journals such as TPAMI, IJCV, and AIJ. He has translated two books on computer vision and written one book on deep learning. His main research areas lie in computer vision and computer gaming, with a current focus on agent learning in complex decision-making problems.
Qiang Fu (Tencent AI Lab)
Wei Yang (Tencent AI Lab)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Greedy when Sure and Conservative when Uncertain about the Opponents
  Tue. Jul 19th through Wed the 20th Room #805
More from the Same Authors
- 2023 Poster: GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
  Hanjing Wang · Man-Kit Sit · Congjie He · Ying Wen · Weinan Zhang · Jun Wang · Yaodong Yang · Luo Mai
- 2023 Poster: SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation
  Shikun Sun · Longhui Wei · Junliang Xing · Jia Jia · Qi Tian
- 2023 Poster: Future-conditioned Unsupervised Pretraining for Decision Transformer
  Zhihui Xie · Zichuan Lin · Deheng Ye · Qiang Fu · Wei Yang · Shuai Li
- 2023 Poster: Cooperative Open-ended Learning Framework for Zero-Shot Coordination
  Yang Li · Shao Zhang · Jichen Sun · Yali Du · Ying Wen · Xinbing Wang · Wei Pan
- 2023 Poster: Opponent-Limited Online Search for Imperfect Information Games
  Weiming Liu · Haobo Fu · Qiang Fu · Wei Yang
- 2022 Poster: Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent
  Weiming Liu · Huacong Jiang · Bin Li · Houqiang Li
- 2022 Spotlight: Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent
  Weiming Liu · Huacong Jiang · Bin Li · Houqiang Li
- 2021 Poster: Modelling Behavioural Diversity for Learning in Open-Ended Games
  Nicolas Perez-Nieves · Yaodong Yang · Oliver Slumbers · David Mguni · Ying Wen · Jun Wang
- 2021 Oral: Modelling Behavioural Diversity for Learning in Open-Ended Games
  Nicolas Perez-Nieves · Yaodong Yang · Oliver Slumbers · David Mguni · Ying Wen · Jun Wang
- 2019 Poster: Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI
  Lei Han · Peng Sun · Yali Du · Jiechao Xiong · Qing Wang · Xinghai Sun · Han Liu · Tong Zhang
- 2019 Oral: Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI
  Lei Han · Peng Sun · Yali Du · Jiechao Xiong · Qing Wang · Xinghai Sun · Han Liu · Tong Zhang