Spotlight
Greedy when Sure and Conservative when Uncertain about the Opponents
Haobo Fu · Ye Tian · Hongxiang Yu · Weiming Liu · Shuang Wu · Jiechao Xiong · Ying Wen · Kai Li · Junliang Xing · Qiang Fu · Wei Yang

Tue Jul 19 02:20 PM -- 02:25 PM (PDT) @ Room 318 - 320

We develop a new approach, named Greedy when Sure and Conservative when Uncertain (GSCU), to competing online against unknown and nonstationary opponents. GSCU makes improvements in four respects: 1) it introduces a novel way of learning opponent policy embeddings offline; 2) it trains offline a single best response (conditioned additionally on our opponent policy embedding) instead of a finite set of separate best responses against any opponent; 3) it computes online a posterior of the current opponent policy embedding, without making the discrete and ineffective decision of which type the current opponent belongs to; and 4) it selects online between a real-time greedy policy and a fixed conservative policy via an adversarial bandit algorithm, gaining a theoretically better regret than adhering to either. Experimental studies on popular benchmarks demonstrate GSCU's superiority over the state-of-the-art methods. The code is available online at https://github.com/YeTianJHU/GSCU.
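To make point 4 more concrete, the following is a minimal sketch of how an adversarial bandit could arbitrate between two policies. It assumes EXP3, a standard adversarial bandit algorithm, as a stand-in; the abstract does not say which algorithm GSCU uses. The class name `Exp3`, the arm labels `GREEDY` and `CONSERVATIVE`, the exploration rate `gamma`, and the reward scaling to [0, 1] are all illustrative assumptions and are not taken from the authors' code.

```python
import math
import random

# Hypothetical arm labels: which policy plays the next episode.
GREEDY, CONSERVATIVE = 0, 1


class Exp3:
    """EXP3 adversarial bandit; assumes rewards are scaled to [0, 1]."""

    def __init__(self, n_arms, gamma=0.1, seed=None):
        self.n_arms = n_arms
        self.gamma = gamma  # exploration rate (assumed value)
        self.weights = [1.0] * n_arms
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the normalised weights with uniform exploration.
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select(self):
        # Sample an arm according to the current mixed distribution.
        probs = self.probabilities()
        return self.rng.choices(range(self.n_arms), weights=probs, k=1)[0]

    def update(self, arm, reward):
        # Importance-weighted reward estimate for the arm that was played.
        probs = self.probabilities()
        estimate = reward / probs[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale by the largest weight to avoid numerical overflow;
        # this leaves the selection probabilities unchanged.
        largest = max(self.weights)
        self.weights = [w / largest for w in self.weights]
```

In use, one would call `select()` before each episode to choose between the two policies, play the episode with the chosen policy, and then pass the observed return (rescaled to [0, 1]) to `update()`. Because EXP3 makes no stationarity assumption about rewards, the selection probabilities can shift toward the conservative policy when the greedy one starts performing poorly, and back again.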

Author Information

Haobo Fu (Tencent AI Lab)
Ye Tian (Tencent AI Lab)
Hongxiang Yu (SJTU)
Weiming Liu (University of Science and Technology of China)
Shuang Wu (Tencent)
Jiechao Xiong (Tencent AI Lab)
Ying Wen (Shanghai Jiao Tong University)
Kai Li (Institute of Automation, Chinese Academy of Sciences)
Junliang Xing (Tsinghua University)
Junliang Xing

Junliang Xing is currently a Professor in the Department of Computer Science and Technology, Tsinghua University. He received his dual B.E. degrees in Computer Science and Applied Mathematics from Xi'an Jiaotong University in 2007 and his Ph.D. degree in Computer Science and Technology from Tsinghua University in 2012. Dr. Xing has published over 120 peer-reviewed papers in conferences such as IJCAI, AAAI, ICCV, and CVPR, and in journals such as TPAMI, IJCV, and AIJ. He has translated two books on computer vision and written one book on deep learning. His main research areas lie in computer vision and computer gaming, with a current focus on agent learning in complex decision-making problems.

Qiang Fu (Tencent AI Lab)
Wei Yang (Tencent AI Lab)
