

Poster in Workshop: New Frontiers in Learning, Control, and Dynamical Systems

Limited Information Opponent Modeling

Yongliang Lv · Yan Zheng · Jianye Hao


Abstract:

The goal of opponent modeling is to model the opponent's policy so as to maximize the main agent's reward. Most prior work handles poorly the setting in which information about the opponent is limited. To this end, we propose Limited Information Opponent Modeling (LIOM), an approach that extracts opponent policy representations across episodes using only self-observations. LIOM introduces a novel policy-based data augmentation method that extracts opponent policy representations offline via contrastive learning and incorporates them as additional inputs when training a general response policy. During online testing, LIOM responds dynamically to opponent policies by extracting opponent policy representations from recent trajectory data and combining them with the general policy. Moreover, LIOM guarantees a lower bound on expected reward by balancing conservatism and exploitation. Experimental results demonstrate that LIOM accurately extracts opponent policy representations even when information about the opponent is limited, generalizes to some extent to unseen policies, and outperforms existing opponent modeling algorithms.
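To make the pipeline concrete, here is a minimal PyTorch sketch of the two ingredients the abstract describes: a trajectory encoder trained with an InfoNCE-style contrastive loss, so that self-observation trajectories collected against the same opponent policy embed nearby, and a response policy that takes the learned embedding as an additional input. All module names, network sizes, and the specific loss form are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrajectoryEncoder(nn.Module):
    """Map a trajectory of self-observations to an opponent-policy embedding."""
    def __init__(self, obs_dim: int, embed_dim: int = 64):
        super().__init__()
        self.gru = nn.GRU(obs_dim, 128, batch_first=True)
        self.head = nn.Linear(128, embed_dim)

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        # traj: (batch, time, obs_dim)
        _, h = self.gru(traj)            # h: (1, batch, 128), final hidden state
        z = self.head(h.squeeze(0))      # (batch, embed_dim)
        return F.normalize(z, dim=-1)    # unit-norm embeddings for the contrastive loss

def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss: anchor/positive pairs come from the same opponent
    policy; the other entries in the batch serve as negatives."""
    logits = anchor @ positive.t() / temperature   # (batch, batch) similarity matrix
    labels = torch.arange(anchor.size(0))          # matching index is the positive
    return F.cross_entropy(logits, labels)

class ResponsePolicy(nn.Module):
    """General response policy conditioned on the opponent embedding."""
    def __init__(self, obs_dim: int, embed_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + embed_dim, 128), nn.ReLU(),
            nn.Linear(128, act_dim),
        )

    def forward(self, obs: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, z], dim=-1))  # action logits

# Toy usage with random data: two trajectory views per (hypothetical) opponent,
# e.g. produced by the kind of policy-based data augmentation described above.
obs_dim, act_dim, batch, horizon = 8, 4, 16, 32
enc = TrajectoryEncoder(obs_dim)
view_a = torch.randn(batch, horizon, obs_dim)
view_b = torch.randn(batch, horizon, obs_dim)   # augmented view, same opponents
loss = info_nce(enc(view_a), enc(view_b))
loss.backward()

# At test time, the embedding of recent trajectories conditions the policy.
pi = ResponsePolicy(obs_dim, 64, act_dim)
action_logits = pi(torch.randn(batch, obs_dim), enc(view_a).detach())
```

In this sketch the encoder is trained offline and frozen (detached) when conditioning the response policy, matching the abstract's offline-representation / online-adaptation split; the conservatism-exploitation balance that yields the reward lower bound is not modeled here.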
