Learning in Bayesian Stackelberg Games With Unknown Follower's Types
Matteo Bollini ⋅ Francesco Bacchiocchi ⋅ Samuel Coutts ⋅ Matteo Castiglioni ⋅ Alberto Marchesi
Abstract
We study online learning in Bayesian Stackelberg games, where a leader repeatedly interacts with a follower whose unknown private type is independently drawn at each round from an unknown probability distribution. The goal is to design algorithms that minimize the leader's regret with respect to always playing an optimal commitment computed with knowledge of the game. We consider, for the first time to the best of our knowledge, the most realistic case in which the leader does not know anything about follower's types, i.e., the possible follower's payoffs. This raises considerable additional challenges compared to the usually addressed case in which follower's payoffs are known. First, we prove a strong negative result: no-regret is unattainable under action feedback, i.e., when the leader only observes the follower's best response at the end of each round. Thus, we focus on the easier type feedback model, where the follower's type is also revealed. In such a setting, we propose an algorithm that achieves a regret of $\widetilde{O}(\sqrt{T})$, ignoring the dependence on other parameters.
Successful Page Load