Workshop: Workshop on Reinforcement Learning Theory

Learning Stackelberg Equilibria in Sequential Price Mechanisms

Gianluca Brero


We study the design of simple economic mechanisms for assigning items to self-interested agents that combine a messaging round with a sequential-pricing stage. The rules of the sequential-pricing stage, and in particular the way these rules use messages, determine how the messaging stage is used. This is a Stackelberg game in which the designer is the leader and fixes the mechanism rules, inducing an equilibrium among the agents (the followers). We model the followers through equilibrium play arising from no-regret learning, and introduce a novel single-agent Stackelberg MDP formulation in which the leader learns to effect a follower equilibrium that optimizes its objective. We solve this MDP using actor-critic methods, where the critic is given access to the joint information of all the agents.
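To make the leader-follower structure concrete, here is a minimal sketch under strong simplifying assumptions; it is not the formulation from the paper. Two followers run Hedge (multiplicative weights, a standard no-regret rule) on a small matrix game whose payoffs depend on the rules the leader fixed, and the leader is scored on the play they settle into. The payoff tables, the names `hedge_play` and `leader_value`, and the exhaustive comparison of two candidate rule sets (standing in for the actor-critic leader) are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hedge_play(payoff_1, payoff_2, rounds=2000, eta=0.1):
    """Two followers run Hedge with full-information feedback.

    payoff_k[i][j] is follower k's payoff when follower 1 plays i and
    follower 2 plays j. Returns the time-averaged mixed strategies.
    """
    n1, n2 = len(payoff_1), len(payoff_1[0])
    cum1, cum2 = [0.0] * n1, [0.0] * n2   # cumulative expected payoffs
    avg1, avg2 = [0.0] * n1, [0.0] * n2   # time-averaged strategies
    for _ in range(rounds):
        p = softmax([eta * c for c in cum1])
        q = softmax([eta * c for c in cum2])
        for i in range(n1):
            avg1[i] += p[i] / rounds
        for j in range(n2):
            avg2[j] += q[j] / rounds
        # Expected payoff of each action against the opponent's current mix.
        for i in range(n1):
            cum1[i] += sum(payoff_1[i][j] * q[j] for j in range(n2))
        for j in range(n2):
            cum2[j] += sum(payoff_2[i][j] * p[i] for i in range(n1))
    return avg1, avg2

def leader_value(reward, p, q):
    # Leader's expected reward under the followers' averaged play.
    return sum(reward[i][j] * p[i] * q[j]
               for i in range(len(p)) for j in range(len(q)))

# Hypothetical rule sets: each one induces a different follower game.
# The leader's reward table is the same under both.
LEADER_REWARD = [[2, 0], [0, 1]]
MECHANISMS = {
    "A": ([[3, 0], [5, 1]], [[3, 5], [0, 1]]),  # action 1 is dominant
    "B": ([[3, 1], [2, 0]], [[3, 2], [1, 0]]),  # action 0 is dominant
}

values = {}
for name, (u1, u2) in MECHANISMS.items():
    p, q = hedge_play(u1, u2)
    values[name] = leader_value(LEADER_REWARD, p, q)

best = max(values, key=values.get)
print(values, best)
```

Both toy games have dominant actions, so the followers' learned play is easy to predict, and the leader's comparison reduces to checking which rule set steers them toward the outcome it values more. In the setting described above, the leader would instead learn its rules with an actor-critic method rather than enumerate them.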
